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Summary. Seasonality (or periodicity) and trend are features describing an observed se- 
quence, and extracting these features is an important issue in many scientific fields. However, 
it is not an easy task for existing methods to analyze simultaneously the trend and dynamics 
of the seasonality such as time-varying frequency and amplitude; and the adaptivityoi the 
analysis to such dynamics and robustness to heteroscedastic, dependent errors are not guar- 
anteed. These tasks become even more challenging when there exist multiple seasonal com- 
ponents. We propose a nonparametric model to describe the dynamics of multi-component 
seasonality, and investigate the recently developed Synchrosqueezing transform (SST) in ex- 
tracting these features from the observations, in presence of a trend and heteroscedastic, 
dependent errors. The identifiability problem of the nonparametric model is studied, and the 
adaptivity and robustness properties of the SST are theoretically justified in both discrete- 
and continuous-time settings. Consequently we have a new technique for de-coupling the 
trend, seasonality and non-stationary error process in a general nonparametric setup. Re- 
sults of a series of simulations are provided; the incidence time series of varicella and herpes 
zoster in Taiwan are studied, and we show scientifically the dynamical seasonality and trend 
introduced by the general application of varicella vaccine. 

Keywords: ARMA errors, Continuous-time ARMA processes, Cycles, Non-stationary pro- 
cesses, periodic functions, Synchrosqueezing transform. Instantaneous frequency Time- 
frequency analysis 



1 . Introduction 



Seasonality (or periodicity) is a phenomenon commonly observed in a time series, for ex- 
ample, the incidence time series of a disease. Several properties of a disease fluctuate cor- 
responding to seasons or other stereotyped calendar periods, like the incidences, spreading 
rates, the occurrence, morbidities or mortalities, etc. For example, the following diseases 



(2011), epidemics of influenza Nunes et al. (2011), etc. The seasonality of these diseases 



are known to exert seasonality with high peak in winter: cardiovascular disease Ishikawa 
et al. (2012), depression Kim et al. (201 1[), suicide Messias et al. (2010), asthma Lin et al. 



leads to higher mortality and high demand of medical resource in every winter season, 
and hence a good understanding of the seasonality of a disease is important in both the 
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clinical science and the public health Stone et al. (2007). Trend is another phenomenon 
commonly observed in the time series. For example, to determine if a general application 
of vaccine is effective in the society, we may like to investigate the overall trend of the 
incidence time series and decide if it has changed. 

There are abundant modern methods available to detect both seasonality and trend of 
the time series, for example, seasonal autoregressive integrated moving average (SARIMA) 



Brockwell and Davis (2002 ) and Trigonometric Box-Cox transform, ARMA errors, Trend 



and Seasonal components (TEATS) |De Livera et al. (2011 ), just to mention but a few. 
Although many of the existing models have been widely applied in many fields, they have 
some limitations. 

First, it is hard for the methods to capture some types of non-linear behavior of the 
system, like the diminishment or changes in the seasonal component. To be more precise, 
the seasonal pattern, including the period and strength, may vary according to time. We 
shall call this non-linear behavior of the system the dynamics of the seasonality. An explicit 
example which we will discuss in Section[5]is how the general application of varicella vaccine 
influences the seasonality of the disease. It is expected that the globally applied vaccine 
prophylactic treatment disturbs the seasonal behavior of the varicella incidence, and the 
trend of the incidence may be changed as well. However, showing these effects scientifically 
is not guaranteed by existing methods. Indeed, the global (parametric) model assumptions 
on the seasonality they make are often too restrictive for real world data such as the 
varicella incidence time series shown in Figure [7] More precisely, violations of the global 
model assumptions can cause large biases in the seasonality estimation. Furthermore, 
when the parametric assumptions on the seasonality are in doubt, the quality of the trend 
estimation can be significantly affected, as any unexplained seasonal dynamics would then 
have to be attributed to the trend, resulting in spurious oscillations in the trend estimate. 
This has been an important issue in analysis of trend and seasonality. Another limitation 
follows immediately from the first one. Since the conventional methods are built upon 
some global model assumptions, the seasonality analysis depends on the whole time series, 
rendering the methods sensitive to the length of the time series. For example, the analysis 
result obtained from a 10-year time series may be different from that obtained from a 5- 
year sub-series. Moreover, there may exist multiple seasonal components, which cannot be 
handled by most of the existing methods. Above all, the error process in the time series is 
often dependent and may be heteroscedastic, making the problem even more complicated 
and challenging. 

Notice that seasonality, or more generally periodicity, and trend phenomena are not 
unique to disease incidence processes. For example, we find oscillatory pattern in the 
commonly observed electrocardiogram signal and the respiratory signal, to name but a 
few. It has been well known that the oscillatory pattern contains plentiful information 
about the underlying physiological dynamics, see for example Wu (2012) and the refer- 
ences therein. Further examples in astronomy, climatology and econometrics have been 
extensively discussed, see for example Hall et al. (2000); Nott and Dunsmuir (2002 ); |0h 
et al. (2004]); [Genton and Hall (200"7| ); [Park et al. (2011[ ); [Rosen et al. (2009[ ); |Pollock 



(2009); Bickel et al. (2008), among others. Thus, if we want to understand more accu- 
rately about the dynamics of a system, it is desirable to develop a new nonparametric 
method that enjoys the adaptivity property to the system dynamics and the robustness 
property to heteroscedastic, dependent errors. 

To tackle the above mentioned difficulties faced by existing methods, in this paper we 
introduce a phenomenological nonparametric model which captures and offers a natural 
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decomposition of the dynamical multi-component seasonality (or periodicity) , the changing 
trend, and the heteroscedastic, dependent errors. Then, we focus our attention on a data- 



adaptive algorithm, referred to as Synchrosqueezing transform (SST) Daubechies and Maes 
|(1996[ ); [Paubechies et al. (2010 ), to isolate the meaningful seasonal component based on 
noisy observations coming from the new model. Specifically, in the proposed seasonality 
model, we allow both the amplitude and the frequency in each of the components to 
vary with time slowly. Intuitively, we would run into the identifiability problem with too 
general a nonparametric seasonality model. We deal with this problem by first restricting 
to the functional class in which, for each of the seasonal components, both the time- varying 
amplitude and time- varying frequency have bounded above first derivatives. Then we prove 
that any two different representations of a member of the class are the same up to a model 
bias of controlled order. In this sense, we say that functions in the class are identifiable 
(up to the model bias). To the best of our knowledge, this identifiability result is so far the 
first theoretical justification for nonparametric modeling of multi-component dynamical 
seasonality. It is important in its own right. Furthermore, we prove that the SST method 
provides not only an adaptive and robust estimator for the seasonality, but also an easy 
visualization of the seasonal components so that we can determine if any hazard occurs in 
disease incidence, even in the presence trend and dependent errors and in both discrete- and 
continuous-time settings. In addition, since the seasonality is modeled nonparametrically 
and the methodology is local in nature, the SST algorithm is insensitive to the length 
of the observed time series, in the sense that the seasonality estimator obtained from a 
time series does not change much as time goes, even when the seasonality is dynamical. 
In the new model, the trend is characterized as a smooth, oscillatory function with "very 
low-frequency," and its extraction is also provided. An important consequence of our 
nonparametric approach to seasonality and trend estimation is that the non-stationary 
error process can be accurately approximated by the residuals obtained by subtracting 
from the time series the trend and seasonality estimates. We demonstrate the applicability 
of the model and method by analyzing a series of simulated examples and incidence time 
series of varicella and herpes zoster extracted from the Taiwan's National Health Insurance 
Research Database (NHIRD), published by the National Health Research Institute in 
Taipei. 

Here we point out three important properties of the proposed methodology for trend 
and seasonality estimation, besides those mentioned before. First, since our model allows 
multiple periodic components, it can pick up both the high-frequency seasonality and the 
low-frequency cycles at once, without knowing in advance the time-varying periods. The 
only requirements are the sampling interval between two successive observations is small 
enough for us to observe the high-frequency periodicity, and the length of the time series 
is large enough for us to see the low-frequency cycles. Second, we show that the SST 
algorithm, originally designed to analyze dynamical seasonality without contamination of 
noise or couphng with trend (Daubechies and Maes (1996); [Daubechies et al. (2010 )), 
can estimate accurately the dynamical periodic component from the time series in the 
presence of trend and general dependent errors. After the periodic component is isolated 
from the time series, we can conduct further investigations on the trend and the error 
process separately, which are relevant in many contexts including forecasting. Third, the 
assumptions we make on the heteroscedastic, dependent error process are mild and are only 
on the power spectrum of the stationary component and smoothness of the modulating 
variance function, and that can be relaxed as evidenced by our numerical study reported 
in Section m 
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In summary, both the nonparametric model we study and the theoretical results pre- 
sented in this paper are new and they have important implications on further studies 
on seasonality, trend and dependent error. In particular, as far as we are aware of, the 
functional classes for the seasonal component have not been considered before and the 
identifiability theory has not been studied before; the smooth trend component has not 
been taken into account in previous papers devoted to studying SST in analysis of sea- 
sonality paubechies and Maes (1996[);|Daubechies et al. (2010[ ); [wr(2012[ ); [Thakur et al 



|(2012[ )). In addition, although Thakur et al. (2012) considered a random error term they 
limited it to Gaussian white noise with the noise level being much smaller than the error 
in modeling the seasonality. By contrast, we allow the error process to be heteroscedastic 
and dependent, require much milder conditions on the variance function, and do not spec- 
ify the distribution of the stationary process but just impose mild conditions on its power 
spectrum. Also, we give the formulae for the power spectrum of a general order continuous- 
time ARMA process so as to argue that it fulfills the required conditions. Moreover, while 
all of the previous works investigate properties of SST in the continuous-time setup, we 
address the problems in both continuous- and discrete-time setups, as the latter is equally 
relevant from the modeling viewpoint and is somehow more important from the sampling 
viewpoint i.e. in practice we can only observe the process at discrete sampling time-points. 

This paper is organized as follows. In Section [2] both continuous- and discrete-time 
models are proposed to model processes that contain trend, seasonality and dependent 
error, functional classes used to model dynamical, multi-component seasonality are intro- 
duced, and identifiability theory of the functional classes is provided. The Synchrosqueez- 
ing transform to separate the seasonality, the trend and error is introduced and studied 
theoretically in Section [3] In Section |4] we examine finite sample performance of the SST 
and compare it with the TBATS model |De Livera et al. (2011 ) via simulation, and in 
Section [5] we report a study on the varicella and herpes zoster incidence time series using 
TBATS, and the SST and trend extraction techniques. Section [6] contains discussions and 
open problems. Proofs of the theoretical results and some background material can be 
found in the Supplementary. 



2. Model 

Oscillatory signals are ubiquitous in many scientific fields. Seasonality, the term wildly 
used in the public health, economics, etc, describes the oscillatory behavior of a given 
time series Yt- Here we list some interesting problems commonly raised in analyzing the 
oscillatory signals. 

Ql: If there are multiple oscillatory components inside the signal, how to detect and 
estimate them? 

Q2: If there exists a trend in addition to the oscillatory components, how to extract it? 
Q3: If the pattern of the oscillatory components is time- varying, how to quantify/identify 
it? 

Q4: Since the length of the observed data elongates as time goes, how sensitive is the 

estimator to the length of the observed time series? 
Q5: If the errors across different time points are dependent, or if the variance of the error 

changes according to time, is the estimator robust to such dependent, heteroscedastic 

errors? 
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2. 1. Some related approaches 

In this subsection, we briefly review some existing methods directly related to our ap- 
proach, and make clear how we generalize the existing methods to answer the above 
questions simultaneously. 



2.1.1. Trigonometric Seasonality and Trend Model 
A simply-minded model for the seasonality reads: 

Yt^ft+Tt + ^t, 

where ft is a deterministic, periodic function modeling the seasonality, Tt is a determin- 
istic function modeling the trend, and $t is a stationary random process modeling the 
dependent errors. In the above model, ft is usually taken as a trigonometric function: 

K 

A = ^^fcCOs(27rai), (1) 
fe=i 

where K E N, and for each k — 1, . . . , K , Ak > and £,k > 0. We call Ak the amplitude, 
^kt the phase function, and ^k the frequency of the k-th seasonal component. A special 
case, which consists of single- component seasonality i.e. K = 1 and a linear trend, was 



considered in Pollock (2009). 



2.1.2. BATS, TBATS 

To resolve the limitations of the Seasonal Autoregressive Integrated Moving Average 
(SARIMA) model ,Brockwell and Davis (2002] ), and to improve the traditional single sea- 



sonal exponential smoothing methods, recently, in De Livera et al. (2011) two algorithms 
BATS and TBATS were introduced. In particular, the BATS model includes a Box-Cox 
transformation, ARMA errors, and n seasonal patterns as follows: 



yi 



(c^) _ / (i^" - when io^O 



log Yf when uj = 



y[''^ ^lt-i+4>bt-i+Y.sfl^^+dt 

i=l 

(t = it-1 + 4>bt-i + adt 
bt = {1 - (l))b + (f>bt^i + I3dt 

s^'^ - s^'^ + -y d, 

where w G M is the Box-Cox transform parameter, mi, . . . , m„ denote the constont seasonal 
periods, b is the long-run trend, {c?t} is an ARMA(p, q) process with Gaussian white noise 
innovation process with zero mean and constant variance, and for t = 1, . . . ,T, it is the 
local stochastic level, bt is the short-term trend and si is the stochastic level of the z-th 
seasonal component. Note that, while both the periodic component and trend in model 
|l| are deterministic, in the BATS (and TBATS) model both sj'-* and It are coupled with 
the single-source error dt. 
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Furthermore, De Livera et al. (2011) introduced the trigonometric representation of 
the seasonal components based on Fourier series: 



ki 



(i) 



3 = 1 
(i) 



Sin \f +j['^dt 



_i sm A'- + Sjj._i cos A' + % dt 



(2) 

(3) 
(4) 



where 7^*^ and 72*^ are smoothing parameters, A^*' 



2TTj /rrii, 3*^1^ is the stochastic growth 
in the level of the «-th seasonal component, and ki is the number of harmonic components 
needed for the i-th seasonal component. The TBATS model is thus defined by replacing 
Sj'-* in the BATS model by I'M . Note that we can rewrite Js^ and in the complex form: 



(0 iB'-'l 

rjj.e J.' — e 



(5) 



(i) i9^'^ 

where r) ie 3.* 



cos A,- 



sy; + is*/f \ e*^j is viewed as a rotational transformation with matrix 



form 



sin A 



(i) 



sin A 
cos A 



(0 



and c^') 



7i ■ 



«72 • 



Thus, (2) models the i-th seasonal 



3 3 

component as the real part of ki complex harmonic functions"Note that the "frequency" 
Xj is fixed all the time. We will come back to this issue later. 

We can view BATS/TBATS as a model "decoupling" the seasonality and the trend 
which are modeled together in SARIMA. In TBATS the decoupled seasonality is modeled 
by introducing trigonometric functions. We refer to De Livera et al. (2011 1 for detailed 
discussion on the BATS/TBATS models. 



2.1.3. Limitations 

As useful as the above models are, there are some limitations. In particular, questions Q3, 
Q4 and Q5 cannot be fully answered so far. First of all, the parameters in the models are 
all fixed and are determined from the entire observed time series, which means that the 
seasonality pattern is fixed and the question Q3 cannot be answered properly. Moreover, 
since the parameter estimation depends on the whole observed time series, the length of 
the observed time series plays a role in the result, thus it is not easy to answer Q4. For 
example, the parameters estimates based on a 10-year time series may be different from 
those estimated based on a 5-year sub-series. In addition, although dependent errors can 
be handled, it is not guaranteed for heteroscedastic errors. In this paper we focus on 
relieving these limitations as well as providing an alternative approach in order to answer 
properly all of the questions Ql - Q5. 



2.2. New Models 

To answer the questions Q1-Q5 properly, we consider the following phenomenological 
model generalizing model ([T]). First, consider the periodic function satisfying the following 
format: 

K 

f{t) = Y,Mt)cos{2TTMt)), (6) 

k=l 
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where Ak{t) , (^'/.{t) > for all i <E M, k = 1,...,K. We call Ak{t) the instantaneous 
amplitude, (j)k{t) the phase function and (/)'j,(t) the instantaneous frequency of the fc-th 
component in /. The mathematical rigor of the expression ^ is discussed in Section 2.3 



The seasonality model ([6]) is parallel to that in ([!]) - the notion "instantaneous fre- 
quency" and "instantaneous amplitude" in (|6| is parallel to the "frequency" and "am- 
plitude" in ([T]). The time- varying nature of <^5,(<) and Akit) allows us to capture the 
momentary behavior of the system. To be more precise, in model ([T]), the physical in- 
terpretation of frequency is how many oscillations are generated in a unit time period, 
while in model ([6]), the physical interpretation of the time- varying positive function </>^(t) 
is how many oscillations are generated in an infinitesimal time period. Similar argument 
applies to the instantaneous amplitude function Ak{t), which describes the "instantaneous 
amplitude" of the oscillation. Note that the frequency of the fc-th harmonic function in 
([T]), ^fe, is equivalent to the derivative of ^fei, where > can be interpreted as a constant 
function defined on K. 

In general, the expression ^ is not unique and the identifiability issue exists. In 



Genton and Hall (2007), which studies single-compenoent seasonality without trend, this 
issue is noticed and both Ai{t) and 0i(i) are modeled by some parametric forms to avoid 
the identiability problem. However, in general, parametric assumptions are restrictive 
and often need to be validated using nonparametric model-checking methods, which are 
unfortunately unavailable. By contrast, we consider the following functional classes in 
which minimal, nonparametric assumptions are imposed on the instantaneous amplitude 
functions and instantaneous frequency functions. 

Definition 2.1 (Intrinsic Mode Functions class A'^^^''^'^). For fixed choices of Q < 
e ^ 1 and < ci < C2 < oo so that ci — 0{1) and ci — 0(1), the space y^^^'"^^ of Intrinsic 
Mode Functions (IMFs) consists of functions / : K -)> C, / G C^(E) n L°°(E) having the 
form 

f{t) = A{t)cos{27T^{t)), (7) 
where A and (j) satisfy the following conditions for all t G M; 

C^(M)nL°°(M), MA{t)>ci, supA{t)<c2, (8) 

cj) e C^{R), inf (t>'{t) > ci, sup(f>'{t) < C2, (9) 

\A'{t)\<ecl,'{t), \r{t)\<ecl,\t). (10) 

Definition 2.2 (Superpositions of IMFs). Fix d> 0. The space A^/f^ of super- 
positions of IMFs consists of functions f having the form 

K 



k=l 



for some finite K > and for each li — 1, . . . , K , fk{t) — Ak{t) cos.{2t: (j)k(t)) G A'^^^''^^ such 
that (j)k satisfies 



m) > 4>'k-iit) and ^Ui)-'^LiW>rfK-W + 0fc-iW]- 



(11) 
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Intuitively, a signal in the A'^^''^'^ class is a single-component, periodic function having slow 
varying instantaneous amplitude and instantaneous frequency. And, the A^^f^ functional 



class models signals having multiple oscillatory components. The condition (11 1 is needed 
due to the dyadic separation nature of the continuous wavelet transform. In addition, 
Ak{t) and (pkif) together characterize the dynamics of the /c-th seasonal component. Note 

The identifiability theory for functions 



that both and A'^^^^'^ are not vector spaces. 



belonging to A^ 



and 



is given in Section 



2.3 



We then model a random process Y{t) with rnuIFi-component seasonality and trend 



behaviors contaminated by heteroscedastic, dependent errors as below: 

Y{t) = f[t) + T[t) + am{t). 



(12) 



where the seasonality fit) — ^^j^ ylfc(t) cos(27r(/)fc(t)) is in y^^^'^^ when K — 1 and in 
•^Td^ when K > 1, the trend T{t) is modeled as a C°° function so that its Fourier trans- 
form, which is a distribution in general, is compactly suppor ted on (— ]^^ Ci, jgj^ci), 



is some stationary generalized random process (GRP), see Gel'fand and Vilenkin (1964[ ), 
and a{t) > so that a g D L°° is a real-valued smooth function used to model the 
heteroscedasticity of the error term. For example, $(i) can be taken as a continuous-time 
autoregressive moving average (CARMA) random process of order {p,q), where p,q > 0, 
defined in Section [8.5.21 of the Supplementary. The heteroscedastic, dependent error pro- 
cess a{t)^{t) specified here is a special case of the locally stationary processes introduced 



Priestley (1965 ), and fitting the time- varying spectra has been considered (Dahlhaus 
(1997[ ); |Hallin (1978[ p80| ; [Rosen et al. (2009[ )). Note that in our model the trend T{t) 



can be regarded as a nonparametric, "very low-frequency" oscillatory signal. Combining 
these with the nonparametric modeling of the dynamical seasonality via the functional 
classes A'^^''^^ and A'^^f '^, model ([l2| provides a general decomposition of the trend, the 
seasonality and the error process. rF enables us to extract information for all of the three 
terms based on observations on Y{t), which we will discuss in Section |3] With the above 



interpretation of model (12), the quantities Ak{t) and (j)'f.{t), k = 1, . . . together are 
surrogates of the dynamics of the seasonality and T{t) models the trend. The problem 
is now reduced to estimating Ak{t), 4>k{t), <i''ki^) a-nd T{t) from the observations Y{t) 
generated by model (12). 

(112 



In practice, we can only access the continuous-time process Y{t), given in model 
on discrete sampling time-points nr, where n ^ 'L and r > is the sampling interval. 
In this case, denote the observations on Y(t) by Y := {y„ = Y{nT)}n£Z- Then we can 



deduce from model ( 12 ) the following discrete-time model 



Yn = /(nr) -t- T{nT) + a{nT) $„, n G Z, 



(13) 



where $„ = $(rtT), n G Z, is a stationary time series, which can be taken as, for example, 
an ARMA time series discussed in Section |S.5.1| of the Supplementary. We can interpret 
(13) as a model for a discrete-time process {Yn}n<£Z in which the deterministic seasonality 
plus trend signal is contaminated by the heteroscedastic errors a{nT) $„, n ^'L. 



Compared with the existing models like SARIMA and TBATS, our models (12) and 



( 13 ) not only can take care of the problem of multiple seasonal components with non- 
integer periods, but also can cope with the system dynamics, including the unknown 
time-varying frequencies and time-varying amplitude modulations. In the TBATS model, 
although the amplitude modulation ([5| and the dynamics of the seasonality ([2]) may change 
according to time, the "changes" in the seasonality and trend are coupled with the ARMA 
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error term, thus both the seasonaHty and the trend are stochastic. On the other hand, 
in our new models (12) and (13), both of the seasonahty and the trend are modeled as 
deterministic terms which are independent of the error term. Hence the TBATS model is 
essentially different from ours. 



2.3. Identifiability of functions in Af , A\]^^ 

It is well known that for a given function there might be more than one representation. 
For example, a purely harmonic function can also be represented as a function having 
time-varying amplitude and time-varying phase: 

cos(27ri) = (1 + a{t)) cos(27r(i 

where h' {£) and a(t) might be "large" compared with 1. Which of the two representations 
is "good" depends on the problem, and different representations lead to different interpre- 
tations. Thus, we start from asking the following question: 

Q: given a function f{t) ~ A{t) cos(2Tr(p(t)) G A^^'''^, how much the different rep- 
resentations in A'^^''^^ for f can differ from each other? 

This is the identifiability problem we face when we introduce the A^^ '^^ functional class. 
In the following theorem we claim that the instantaneous amplitude function A(t), the 
instantaneous frequency function 4>'{t) and the phase function 0(t) in all the different rep- 
resentations of a function in A'^^''^^ can only differ up to a smooth, small model bias of 
order e. In this sense we say that a function in A^^''^^ is identifiable up to a model bias 
of order e. Also, in this sense we have a mathematically rigorous meaning of the "instan- 
taneous frequency" and the "instantaneous amplitude" when we work with the functional 
class v4Ji''^^. The proof of this theorem is postponed to the Supplementary. 

Theorem 2.1 (Identifiability of single-component seasonality). Suppose that 
a{t) cos (t>{t) e A^^''^'^ can be represented in a different form which is also inA'^^''^^, that is, 

a(t) cos 0(0 = [a{t) + I3{t)] cos[0(t) + a{t)] e A'i^'"^. (14) 

Define tm & rn £ Ij, so that 0(t,„) = (m + l/2)7r and s,„ G M, m G Z so that 
</)(s™) = nm. Then a G C^{R), (5 G C2(K), a(t„) = Vm, ^(s„) > Vm and /3(s„) = 
if and only if a{sm) — 0. Moreover, we have \oi'{t)\ < Sire, \a{t)\ < and \(3(t)\ < Aire 
for all i G M. 



Theorem 2.1 consists of two conclusions. The first conclusion is that the perturbations 
a and /3 must have some "hinge points" and so they are restrictive. This property comes 
from the positivity condition of the instantaneous frequency and instantaneous amplitude 
functions. The second conclusion is that the absolute values of a, a' and (3 cannot be large. 
This property comes from the "slowly varying" conditions of the A'^^''^^ functional class 



and the existence of the hinge points. Theorem 2.1 has its own interest and further study 
on this topic is beyond the scope of this paper. Similarly, the identifiability issue exists for 
functions in the class A"^^^^ , and in the following theorem we describe the identifiability 
theory for A'l^^^ . From the theorem, we conclude that any multi-component periodic 
function in A't^f '^ is again identifiable up to a model bias of order e. 
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Theorem 2.2 (Identifiability of multiple-component seasonality). Suppose 
f{t) = C0S(/!)i(t) G vA^^^"^^ can be represented in a different form wtiicti is also in 

■^Td^^ ' ^^^^ 

N M 
Z=l 1 = 1 

thenM = TV, |0,(t)-$,(t)| < 2E„,e, \(l)[{h) - %{b)\ < 44E„,e and \aiit)~Ai{t)\ < 2c2E,r,e 
for all I — 1, ... , N , wtiere Em > is a finite universal constant depending on c\, Ci and 
d. 



3. Method and Theory 



We need a method to analyze observations generated from models (12 1 and (131 so as to 
extract information for the trend T{t) and the dynamics of the seasonality, Ak{t), 4>k{t), 



and 4''j^{t), k = 1,...,K. Time-frequency (TF) analysis Flandrin (19991 is commonly 



applied to analyze the signal expres sed in (p| . Reassignment appro ach, studied in |Flandrin 
|(1999P ; [Auger and Flandrin (1995[ ); |Chassa nde-Mo ttin et al. (1997[[2053l ), is a technique in 
TF analysis aimed at giving a more accurate estimate of 0'^ (t) from the TF representation 
provided by, e.g., short time Fourier transform (STFT) or continuous wavelet transform 
(CWT). However, the estimation of Ak{t), the reconstruction of each component fk{t) and 
the robustness to noise are not guaranteed in general. 

In this paper, we consider a newly developed reassignment method referred to as the 
Synchrosqueezing transform (SST), which was introduced to study dynamical seasonality 



without presence of trend or random errors (Daubechies and Maes (1996); Daubechies 



et al. (2010)). We focus on the theoretical framework and show the capability of SST to 



accurately estimate the functions 0fe(i), 4''ki'^) A^it) when the trend is present, and its 
robustness to heteroscedastic, dependent random error processes, for which the data are 
modeled by (12 1 or (13). Before stating the SST algorithm, in the following subsection we 



introduce some notation first. 



3. 1. Notation and Background Material 

Denote by S the Schwartz space and let S' be its dual (the tempered distribution space). 
When g ^ S' and h € S, g{h) means g acting on h. Here, sometimes people use the 
notation g(h) :— J ghdt which is consistent with the case when g is an integrable function. 
Given a function h E S, its Fourier transform is defined as /i(^) :— /i(t)e~'^'^^*dt. The 

Fourier transform of g G S' is thus defined by g{h) := g{h). 

Take ^ S. For fc G N U {0}, define the following abbreviations: 

^kkj^^^'^ (^) ' ^^J'Hx) := 

where -0'^'^^ is the fc-th derivative of V'j a > and 6 G M. Recall that the continuous wavelet 



transform (CWT) Daubechies (19921 of a given f{t) G S' is defined by 



Wf{a,b) = / /(t)V'a,6(i)di, (15) 
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where a > and 6 G K. Here we follow the convention in the wavelet literature that ip is 
called the mother wavelet, and a means scale and b means time. To ease the notation, the 
moments of ip are denoted as /f^ = ^ \x\'\ilj^''\x)\dx for fc = 0, 1, . . .. 

Let W be the standard Brownian motion and D be the differentiation operator in the 
general sense. Then DW is the Gaussian white noise. Denote W^""^ := D"W, n > 1, is 
the n-th differentiation of the standard Brownian motion in the general sense. Recall that 



DW is a special case of generalized random process (GRP), see Gel'fand and Vilenkin 
|(1964[ ). Fix a GRP $. When ip £ S, $('0) is understood as a random variable modeling 
the measurement of $ when it is characterized by the measurement function ijj. 

Next we recall the notion of power spectrum of a stationary GRP $. The correlation 
functional of <I>, denoted as 5$, is: 



:=E[*(0)<i>W], (16) 



where 0,-0 € 5 Gel'fand and Vilenkin (1964). Then, there exists a functional Bq so that 



i?<,(0,V) = (i?o,0*V'*), (17) 
where ★ stands for convolution. Here, Bq is a generalized function of one variable which 



is the Fourier transform of some positive tempered measure (Gel'fand and Vilenkin, 1964 



Equation 3 above Theorem 1 in Chapter HI). Moreover, by Theorem 1 in (Gel'fand and 



Vilenkin, 1964 Chapter HI), we have 



i3$(0,0)= / mmdriiO (18) 



where rj is the unique positive tempered measure associated with <I> so that rj = Bq. In 
general, we call drj the power spectrum of the GRP $. Notice that in the special case of 
DW, di]{(_) = d£_. Thus, the variance of $(V'a), where ^a{t) ■— 7^V'(f)i a > 0, is simply: 



YaviHlPa)) = E$(V'a)$(0a) = J MOMOda{0 = a|| V-^) II £2 (r^^) . (19) 

It is clear that the variance of ^{ipa) depends on the scaling a and on the power spectrum. 
In the special case of DW, the variance does not depend on the scale a since dry(^) — d^. 



3.2. Synchrosqueezing transform 

In this subsection, we briefly recall the main idea of reallocation methods, introduce the 
synchrosqueezing transform (SST) algorithm, state its theoretical properties, and then 
summarize its advantages over other methods, especially for our purpose, the seasonality 
analysis. 

Take a TF representation, denoted as Rf : {t, ^) e C, determined by f{t) G A'^^^^ 

based on a TF analysis, for example, STFT or CWT. The reassignment methods "sharpen" 
Rf{t,£^) by "re-allocating" the value at to a different point (t',C') according to some 
reassignment rules (Flandrin (19991). 

The SST algorithm, which is a special case of the reassignment method tailored to 
analyze a clean function /(t) S ■^Tf^ without noise, is composed of three steps. First, 

choose the mother wavelet ^ G S so that supp V C [1 — A, 1 + A], where A ^ 1, and 
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calculate W/(a, 6), the CWT of f{t) as given in (15). Second, we calculate the function 



w/(a, h) defined on IR+ x M, which plays the role of the reassignment rule: 

^ I oo when \Wf{a,b)\ =0. ^ ^ 

By its definition, LOf{a,b) contains abundant information about the instantaneous fre- 
quency functions in /. Indeed, when / is a purely harmonic function, ujf{a,b) takes on 



the value of the frequency of / if it is finite. We refer to |Daubechies et al. (2010 ) for the 



details. Third, the SST of f{t) is defined by re-assigning the TF representation Wf{a,b) 
according to the reassignment rule ujf(a,b): 

S]it,0 '^n-n I K{\ujf{a,t)-£,\)Wf{a,t)a-^''^da (21) 

Q— >-0 J 

{(a,t): |W/(a,t)|>7} 

where {t,C) G M x M+, a,7 > 0, ha{t) = ^h{t/a), h e i^(IR), and ha ^ S weakly when 
a — with S denoting the Dirac delta function. Thus, at each time point t, Sj{t, ^) collects 
all CWT coefficients with scales a at which the CWT detects a seasonal component with 



frequency close to ^. As we will see in Theorcm |3.1[ according to the reassignment rule ( 20 1 , 
Sj{t,^) will only have dominant values around (t)'i^{t), which allows us an accurate estimate 
of (p'l^it) numerically. After an estimator for (/)'j,(t) has been obtained, to reconstruct the 
k-th component fk{t) — Ak{t) cos(2TT(j)k(t)) in /, its amplitude modulation Ak{t) and 
phase 4>k{t), we resort to the reconstruction formula of CWT and consider the following 
estimators: 

Mt) := 3?/^(t), (22) 

where TZ^ := J ^^^dC, X denotes the indicator function, 3? means taking the real part, 
and r > is chosen by the user, 

and an estimator for 4>k{t), denoted by <pkit), can then be obtained by unwrapping the 



phase of the complex signal "j^^- We mention that the reconstruction formulae (|22|) is 



slightly different from that in Estimate 3.9 in Daubechies et al. (2010). These formula are 



actually equivalent, as is shown in the proof of Estimate 3.9 in Daubechies et al. (2010[ ) 



In practice we find that (22) performs slightly better numerically. Moreover, it can be 
applied to other time-frequency analysis techniques which provide accurate instantaneous 
frequency estimation. Thus we suggest it as our reconstruction formulae. 

Here we summarize how we numerically implement SST based on discretization. We 



refer the readers to Thakur et al. (2012) for further details of implementing SST and 
its application to paleoclimatic data. Given a time series Yn without noise, so that 
Yn = /{nr) + T{nT), n G 1, . . . ,N, where r > is the sampling interval. Denote the 
numerical implementation of CWT based on Y :— {Yn}^^^ by an x ria matrix 
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with the discretization interval in log2 (a) . Also, denote the numerical implementation 

~ 1_ 

of SST based on Y by an x matrix 5^, where — [ J is the number of 

the discretization of the frequency domain by equally spaced intervals of length 



= -j^. In the formula (21) and (22 1, the number T plays the role of a thresholding 
parameter. When the random error is Gaussian white noise, we may follow the suggestion 



provided in Thakur et al. (2012[ ) to choose F. 



With the implemented CWT and SST, we first estimate 0'^, by fitting a discretized 
curve c* € •Z'^, where Z„j, = {1, . . . to the dominant area of Sy by maximizing the 

following functional: 




\S^{c{m),m)\ ^ 



-A^ |c(TO)-c(m-l)|^ 



where the user-defined parameter A determines the smoothness of the resulting curve 
estimate. Then we define 

Mnr) :=^, n=l,...,N. 

With <j)'f.{nT), n — 1, . . . ,N, the fc-th seasonal component fk{t) is thus reconstructed as 
below: 

1 ^ 



A 

■_! 1 



fk{nT)^^f^inT), n^l,...,N. (23) 
With we get Ak := \f^\ and (f>k as the unwrapped real function of A 



When there is no noise and no trend in Y{t), it has been shown in Daubechies et al. 
|(2010[ ) that the SST algorithm can help us to complete the mission. However, in practice, 
the observed signals are contaminated by noise and infiuenced by the trend. We can for 
sure analyze the noisy signal using the above SST algorithm. But, the question is whether 
or not the noise and trend will deteriorate the performance of SST, leading to erroneous 
results. In Section [373} we show that SST is robust to noise to some extent. Moreover, we 
show that the SST algorithm allows us to extract the trend accurately when we replace the 
clean signals Yn = finr) + T{nT), n — 1, . . . ,N, in the above algorithm with observations 



generated from model ([12j) or ( |13[ ). The analysis summarized in Section 3.3 suggests that 
we can estimate the trend T{nT) by 



1 

(l~A)ci 
L (l-l-A)Aa J 



r(nT) =r„-5^7^^l— y WYin,i)T^^, n = l,...,N. 



Here we remark that the above estimators may be noisy to some extent since they are 
pointwise in nature, and we can apply some curve fitting techniques to the above prelim- 
inary estimates in order to obtain more stable estimators. We may also consider another 
reconstruction formula, also equipped with the CWT: 

/oo /"OO I 7 

/ Wf{a,b)a-''/^ij(—)dadb, 
.In \ a / 
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where is the constant for the reconstruction and the integration with respect to h helps 
to smooth the reconstruction estimator in time. We will not get into these numerical 
details in this paper, however. 



3.3. Theory 

Before stating the robustness theorems, we make the following assumptions and define 
some notation. 

Assumption (Al): Assume the mother wavelet e 5 is chosen such that ^/^ C [1 — 
A, 1 + A] and = 1, where A < d/{l + d). 

Notation (Nl): Suppose the power spectrum drj of the given GRP $ satisfies := 
J(l + I'fD^^'dyy < oo for some I > 0. Denote by Ei and E2 constants depending on $, 
ci, C2, d and the zeros and first moments of 'tp^'^\ k = 1, . . . ,1. These constants are related 
to the noise level and are specified in the proof of Theorem |3.1| 

Notation (N2): Denote by E\y, E^ and E,. universal constants depending on the mo- 
ments of ip and tp' , Ci, C2 and d defined in Lemma S.2.1 in the Supplementary. These 
constants are related to the model bias introduced by the model A^^f^ . 

We now state the robustness property of SST when the seasonality plus trend signal is 
contaminated by an "almost" stationary GRP. The proof is postponed to the Supplemen- 
tary. 



Theorem 3.1. Suppose Y{t) follows model (12) and assumption (Al) holds. We fur- 
ther assume a g C"°^'''f"'^''^(M) so that ||cr||ioo <c 1 and := max^^^ ^^•^ } ^ 
1, and var$ (■(/;) = 1. Then, when e is small enough, for each b G R we have the following 
results. 

(i) For each a £ [ ^^^ , ^~jr^]7 with probability higher than 1 — ^^(Elaib^)^^^^ ' have 

K 

Wy(a,6)-^yli(5)e^2.0,(b)^^(^^,(^))| < (^E^a{b)f' + Ewe; 
1=1 

(it) For each a G [|j=^, with \Wf{a,b)\ > {Ei(7{b)f/^ + e^^^ , where k^l,...,K, 

with probability greater than 1 — (32c2+5) (Ei<j(b)+E2e„) have 

((£;i<T(b))2/3+ei/3) 

Ma, b) - 0,(5)1 < \;Wj{aM 

, (i?ia(6))2/3 + ei/3 E^e 



\Wf{a,h)\ 2iT\Wf{a,b)y 
(Hi) With 7 > 1 and with probability higher than 1 — 7"^, for k = 1, . . . , K we have 

/ WY{a,b)x\WY(aM\>E,aib)a'^^^da~Ak{b)e^^'^''^'^\ 

J Zkib) 

<( m ^0 iEMb)+E2e^? \ iE,a{b)+E2e,y A 
- ' ci (i?ia(5))2 J EMb) 

+ {Ewe + Eia{b))'^A. 

'<"tp 
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We have some remarks about the theorem. 

(a) Note that finite-order polynomials and harmonic functions with "very low" frequen- 
cies are special cases of the trend model considered in Theorem |3.1[ 



(b) By Lemma S.5.1 in the Supplementary, a stationary CARMA(p, process, where 
p,q > 0, satisfies the conditions in Theorem |3.1| for some / > 0, so the theorem 
applies when $ is taken as a stationary CARMA(p, q) process. 



(c) From (ii) in Theorem 3.1 it is clear that when we estimate the instantaneous fre- 
quency (l>'kib)j the larger the |W/(a, 6)| is the smaller the estimation error is. Intu- 
itively, the smaller |W/(a, 6)| is, a larger estimation error might be introduced since 
|VF/(a,6)| is in the denominator. 



(d) Notice that there are two terms in result (iii) of Theorem 3.1 The first term comes 
from the error process cr^ and its heteroskedasticity, and the second term comes from 
the model bias with A^'^''^^ /A'^''^'^^ modeling the seasonality. Also notice that when 
a{b) dominates e and e^i the estimation error is of the same order as <j{b), which is the 
standard deviation of the error process. In this case, if we have interest in recovering 
the error process, we may choose a smaller A so that the estimation error is smaller 
than a(b) and hence with high probability the realization of the error process can 
be approximated accurately. We also comment that in the proof the autocorrelation 
structure of the error process is not used; by taking this structure into consideration, 
we may achieve a better estimator. Also, the reconstruction formula can be viewed 
as an adaptive bandpass filter which removes the energy of the noise out of the range 
of interest. 

Next, consider the case where we have discrete-time observations Y = {Y,i}n£Z from 



model (13). Note that although all discretized CARMA GRP are ARMA time series 
Brockwell and Hannig (2010), not every ARMA(p, g) time series can be embedded into 



a CARMA(p',q') GRP for some p',q' € N U {0} [Brockwell (1995[ ). Thus, if we model 



the dependent errors in ( 13 ) directly by an ARMA(p, q) time series but not a discretized 



CARMA GRP, Theorem 3.1 may not be always applicable. Denote the sampled values 
of the seasonality / G A'^^^^ by / := {/('^T)}„gz, the sampled values of the trend T by 
T := {r(7iT)}„gz, and the errors by X := {X^ = o-(nr)$„}„gz so that r„ = /(nr) + 
T{nT) + (T(nT)$„, where r > is the sampling interval. Here we denote the discretized 
version of CWT and its derivative in b by: 



3/2 ^ a 

Then the following theorem states the robust property of SST when the data come from 



model (13). We introduce the following additional assumptions on the model and further 



notation. 

Assumption (A2): For the time series Y — {Yn}nei, in model (13), we assume Ak{t) G 
C^{R) and suptgR \Al{t)\ < ec2 for all k = 1, . . . , K . 

Notation (N3): Let Et^i and i?T,2 be constants depending on a given error process 
ci,C2,d and the zeros and first moments of ■0 and V''- These constants are related to the 
noise level and are influenced by the sampling interval t. They are specified in the proof 
of Theorem IHH 
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Notation (N4): Denote by Er^w, -E't.w and i?T,r universal constants depending on the 
moments of if) and ?/;', ci, C2 and d, which are defined in Lemma S.2.2 in the Supplementary. 
These constants are related to the model bias introduced by the A^^^f^ class and are 
influenced by the sampling interval r. 



Theorem 3.2. Take a time series Y = {Yn}n£Z following model [13) and suppose 
assumptions (Al) and (A2) hold. Assume a £ C^(M) so that \\a\\L^ ^ 1, := 
max{||CT'||Loo, ||ct"||loo} <g; 1, the sampling interval t satisfies < r < ^^"^ 
if e is small enough, for each b CzR we have the following results. 



Th 



en, 



(i) For each a G 
have 



r i-A i+A i 

L C2 ' Ci J 



with probability higher than 1 



(E,,i<t(6))4/3 



we 



K 



1=1 

< {ErMb)?'^ + {Ew + T^Er^w)e; 
(it) For each a G ^] with \Wf{a,b)\ > {Eia{b))^/^ + e^^^ , where k^l,...,K, 



with probability greater than I — (32c2+5) {E^_i(7{b)+T Er,2ta) 



, we have 



((£;,,l£7(fc))2/3+el/3) 

(42C2 + 4){Er,i(j{b) + T^Er.2e,y 
\Wfia,bW 

(i?,.l(7(6))2/3 + el/3 



+ 



Et.lj^ 



\Wf{a,b)\ 27T\Wf{a,b)y 
(Hi) With 7 > 1 and with probability higher than 1 — 7^^, for k = 1, . . . , K we have 



< 



WY{a, b)x\w^(^a,b)\>E^,^.{b)a-^'^Aa - Afc(6)e2-^'^'(^) | 

40 {Er.Mh) + T-'Er^^egf \ (Er.Mb) + r^E^.^eg)^ A 
ci {Er,Mb))^ J ErMb) 



{[T^Er.w + Ew]e + ErMb)) A. 



The comments (a), (c) and (d) given immediately after Theorem 3.1 still hold for 



Theorem |3.2| But we have more comments for Theorem |3.2| regarding the discrete-time 
samples. 

(e) The regularity condition on Ai{t) is added for the purpose of demonstrating the 
interaction between the model bias parameter e and the discretization effect, as can 
be seen from (S.24) of the Supplementary. 

(f) The condition on the sampling interval r rings a bell of the Nyquist rate. Indeed, 
since locally the signal oscillates in a way close to harmonics, we expect to see that 
its spectrum is "essentially supported" on the frequency range [ci,C2]. This fact can 
be seen in the proof of the theorem. Thus, the sampling interval r has to be shorter 
than l/c2 in order to avoid the aliasing effect introduced by the discretization. 
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With Theorem 3.1 and Theorem 3.2 we summary the main properties of SST that 



render it suitable for use in determining the seasonahty. 

(PI) Fix a harmonic function f{t) — cos(27r^fe<), where Ak > and S,k > 0. It 

is well known that its Fourier transform is /(^) = Sfc=i^fe('^Cfc + ^-ik)/"^' which 
leads to the time- frequency representation, or time- varying spectruirjfj Rf{t,S^) = 
■^^^=1 AkS^^{£_) when ^ > 0. In this case, the time-varying spectrum does not 
depend on time indeed. Ideally, given a function f{t) = X^a—i ^k{t) cos{2tt (j)k{t)) so 
that Ak{t) > and 4>'i.{t) > for t G R, we would expect to have the "time- varying 

spectrum" Rf{t,£^) reading like = ^ X]fe=i ^fe(^)'^</''fc(i)(C)j where 6 denotes 

the Dirac delta function. This expectation can be fulfilled to some extent according 
to (ii) in Theorem 3.1 and Thereom 3.2 when / e A'^^f'^ . Indeed, it tells us that the 



and Thereom 
SST provides an approximation to t 

^7 



lis "ideal spectrum" when / g A'^^^f^ since only 



near {t,(j)'f.{t)) can the value of \S'j{t,^)\ be dominant. This property further allows 
an easy visualization of the instantaneous frequency 4>'i^{t), if the seasonality exists. 
Similarly, the intensity of the dominant value reflects the value of the amplitude 
modulation Ak{t). 

(P2) SST is an invertible transformation in the sense that we can reconstruct each com- 



ponent of f{t) accurately, as is shown in Theorem 3.1 and Theorem 3.2 Once the 
existence of seasonality is confirmed by SST and the period of time within which 
the seasonality exists, this property allows us to recover the seasonal oscillation, for 
example, of the epidemic system so that we can determine when the incidence of the 
disease is highest. 



(P3) The existence of the trend modeled in Theorem 3.1 and Theorem 3.2 do not interfere 
with the seasonality estimation. This property allows us to estimate the trend even 
when seasonality exists, which is important since in some situations the main focus 
is trend estimation and the seasonality is regarded as a nuisance parameter. 

(P4) The properties (P1)-(P3) are robust to the existence of stationary GRP multiplied 
by a slowly varying function in the continuous-time case, or stationary time series 
multiplied by a discretized slowly varying function in the discrete-time case, for 
example, the CARMA(p, q) GRP or ARMA errors. It is this important property 
that renders SST applicable in the seasonality analysis. 

(P5) Since the estimation procedure is local in nature, the result derived from SST is 
insensitive to the length of the observed time series in the following sense. In practice 
the collected data accumulate according to time. Suppose we observe a time series 
Yi from time to time ti, ti > 0, run some analysis on Yi and get an estimator Yi. 
Then we continue the data collection process and reach time t2 > ti. Denot^ the 
collected time series as Y2, run the same analysis on Y2 and get an estimator ¥2- It 
is important to have the property that Y2|[o,til is close to Yi; otherwise the analysis 
may be meaningless. 

(P6) The constants appearing in the estimation errors, for example those defined in (Nl)- 
(N4), depend only on the higher order moments of the chosen mother wavelet ip but 
not on the profiles (or shape) of tp. Thus, choice of the mother wavelet is not crucial 

fNote that we use ^Ifc instead of Al to simply the discussion. 
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to ensure properties (P1)-(P5). In this sense, we say that the method is adaptive. 
Indeed, one can even show that CWT is not essential in the whole algorithm in the 
sense that the variational approach is possible Daubechies et al. (2010). 



4. Simulated Examples 

To demonstrate the capability of SST to detect dynamical seasonality and other proper- 
ties, we tested it on two simulation examples and report the results in this section. Since 
the TBATS model is not designed for seasonality with dynamics, in the first simulation ex- 
ample we consider the seasonality is composed of multiple pure trigonometric components 
and compare SST and TBATS. Note that in the first simulation setting no dynamics exists 
in the seasonality. In the second simulation setting we consider the seasonality modeled 
by A'^^^^^ and show the main difference between SST and TBATS. We compared SST with 
TBATS in that TBATS is so far the best algorithm in seasonality analysis, to the best of 
our knowledge. The code for implementation of SST is in the authors' homepage[|] We 
called the R forecast package to run TBAT^ We ran the simulation and data analysis 
on a macbook having 4GB 1333 MHz DDR3 ram, 1.7 GHz Intel Core 15 CPUs. 



4. 1. Simulation settings 

Define the following two functions modeling the seasonality: 

siA{t) ■— 2.5 cos(27rt), si_2 ■— 3cos(27r^i) 
si{t) := si,i{t) + si^2{t) 

and 

Ai{t) :=2-h0.5(l-H0.1cos(t))arctan(t-13), A2(i) := 3.5x[o,7.5] (*) + 2x(7.5,io] W 

0i(t) t + 0.1 sin(i), 02(t) 3.4t - 0.02^^-3 

S2,i(0 := Ai(<)cos(27r(/)i(t)), S2,2 A2(t) cos (27r02 W) 

S2(i) S2a(0 + S2.2{t), 

where x is the indicator function. Note that si is composed of two harmonic functions and 
their frequencies are not integer multiple of each other, and S2 models the seasonality with 
time-varying behavior. By definition, S2 is composed of two IMFs with instantaneous 
frequencies ^^(i) = 1 -|- O.lcos(i) and (j)'2{t) = 3.4 - 0M6t^-^. We chose the following 
function to model the trend: 

We then considered the following clean signals: 

hit) :=si(t)+r(t), /2(t) :^S2{t)+T{t). 

We demonstrate the clean Ai{t), A2(t), (t>2(.t), T(t) and S2{t) + T[t) in Figure [ij 

|http : //www .math.princeton. edu/~hauwu 
?http : //robjhyndman. com/sof tware/f orecast/| 
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Fig. 1. The clean signals in /2. Left column, from top to bottom: Ai{t), <l>'i{t), S2,i(t) and T(t); 
right column, from top to bottom: A2(t), (j)'2{t), S2,2(i), T{t) and /2(i). Note that the instantaneous 
frequency and the amplitude modulation functions are not constant, which model the dynamics of 
the system. 

We discretized the clean signals in the time period [0, 10] with the sampling interval 
r = 1/100 so that we have N = 1000 sampling points. The sampled time series is denoted 
as fi e so that the Z-th entry of fi is /i(Zt) for alH = 1,...,A^. Similarly, the 
sampled time series on /2, si, S2 and T are denoted as f^, si, S2 and T, respectively. 

To model the noise, we took 

Xa{n) := a-(nr)(4Xi(n)x„e[i,Af/2]("') +-^2(n)Xne[iv/2+i,Af](n)), 
X{n) := 2a{nT)Xi{n) 

where a{nT) = 1 + 0.1 cos(7tot), and Xi (resp. X2) is an ARMA(1,1) time series deter- 
mined by the autoregrcssion polynomial a{z) = 0.5z + 1 (resp. a(z) = — 0.2z + 1) and the 
moving averaging polynomial b{z) = OAz+ 1 (resp. b{z) = 0.51z + 1), with the innovation 
process taken as i.i.d. student t4 random variables. The sampled time series on Xcr and 
X arc denoted as X^r G and X G M.^ respc;ctivcly. Note that is non-stationary. 
We then tested our algorithm on the following three time series: 

Yo:=fi+X, Y,:=f2 + X„, Y2 := + ^Xgarck, 

where Xgarch is a GARCH(1,2) time series with ARCH coefficients (1,0.2), GARCH 
coefficients (0.2, 0.3) and N(0, 1) disturbances . 

4.2. The SST tested on the clean signal /2 

We start from demonstrating the result of applying SST to the clean signal f2- We took 

ip £ S so that = cxp ( 3)2_i ) ■ To reduce the boundary effect, we padded 

both ends of the signal by symmetric reflections when we ran SST. Notice that doing 
so is not the optimal solution in coping with the boundary effect, but is only for our 
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convenience. The results are shown in Figure |2] from which we have the fohowing findings. 
First, the instantaneous frequency functions of S2,i and S2,2 can be seen clearly from the 
dominant curves in the left panel. The time-varying amplitude modulation functions are 
also visually clear in the SST representation. Indeed, the smaller the amplitude is, the 
lighter the intensity of the dominant curve is. The reconstruction of each component and 
the trend are shown in the right column. It can be seen that except for S2,2 and near 
the change-point 7.5, the reconstruction is satisfactory. Note that this kind of "sudden 
changes" in the signal S2.2 is not theoretically analyzed nor numerically improved in the 
current paper. 
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Fig. 2. Synchrosqueezing Transform applied to fz- Left: The SST results on the clean signal fz, 
shown in the bottom right panel in FigureQ] The x-axis is time and the y-axis is frequency. Right, 
from top to bottom: the S2,i, S2,2 and T are shown as the black curves, and S2.1, S2.2 and T are 
respectively superimposed as the red curves. It is clear from the figure that the reconstruction is 
almost exact except for 82,2 at time 7.5. 



4.3. Comparison of SST and TBATS on Yq 

We further analyzed 200 realizations of l^o using SST and TBATS. To speed up the 
computation of TBATS, based on the fact that the true seasonal periods are 100 and 
IOO/tt, we ran TBATS with 2 seasonal components and the seasonal periods ranging 
respectively in {100/0.96, 100/0.98, 100, 100/1.02, 100/1.04} and {100/(7r- 0.04), 100/(7r- 
0.02), IOO/tt, 100/(7r + 0.02), 100/(7r -1- 0.04)}. We determined the "best" seasonal period 
lengths based on the AIC values of the fitted TBATS models. If more than one pair of 
seasonal periods yielded the same minimal AIC value and the pair {100, IOO/tt} is one 
of them, we picked {100, IOO/tt}. In this case, the chosen optimal seasonal periods were 
consistently equal to 100 and IOO/tt for all the realizations. 

We report the relative root average square estimation error (RRASE) to measure the 
estimation accuracy of the two different estimators. Denote by Si.i, S12 and T generic 
estimators of Si^i, Si^2 and T, respectively, and denote by Xa- :~ Yq — Si^i — S12 — T 
the residuals, which can be used to analyze the random errors X^. The RRASE, and 
its standard deviation, results given by SST and TBATS for Yq are shown in Table [T] 
The computation time (in seconds) of SST and TBATS, and its standard deviation, are 
reported as well. In Figure |3j we demonstrate the results for the realization which yielded 
the median RRASE value, among all the realizations. Note that the seasonal components 
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Table 1. Yo'- Two harmonic seasonal components and trend contaminated by stationary noise. 



Results for Yo 






Sl.l 


Sl,2 


T 


Y^ 


Time 




SST 


0.128 ±0.034 


0.166 ±0.025 


0.02 ± 0.005 


0.163 ±0.02 


7.91 ±0.37 




TBATS 


0.109 ±0.06 


0.097 ±0.058 


0.026 ± 0.007 


0.144 ±0.035 


115.3 ± 14.6 





in Yq are purely harmonic and the true values of the seasonal periods were contained in 
the period candidate sets when we ran TBATS, thus it estimated well the signals. On 
the other hand, SST resulted in smaller RRASE standard deviation, although it yielded 
larger RRASE (because it has to estimate the signals nonparametrically from the noisy 
data). Also, notice that the seasonal component with low frequency determined by TBATS 
contains artificial local extrema inside each oscillation. These artificial local extrema might 
lead to misinterpretation and have to be taken into consideration when using TBATS. 




02468 10 02468 10 

Fig. 3. Performances of TBATS and SST on Yq. Left (right): The TBATS (SST) results for the 
realization of Yo with the median RRASE value. J"he first^row shows the realization of Yq. The 
second to fifth rows show respectively si,i, si,2, T and (black curves), with the si,i, si,2, T 
and X„ superimposed as red curves. 



4.4. The SST tested on signals with dynamics and non-stationary noise 

When we analyzed the processes Yi and Y2 using TBATS, based on the ground truth 
we set 2 seasonal components with the period lengths ranging in Ii — [100/1.05, 100/0.95] 
and I2 = [100/3.2, 100/2.6] respectively. Notice that the chosen Ii and I2 respectively 
contain the ranges of (t) and (/)2 (t) . We divided each of Ii and I2 into 5 equally spaced 
points, and determine the "best" seasonal periods based on the AlC values of the fitted 
TBATS models. In both cases of Yi and Y2, the chosen optimal seasonal periods varied 
from time to time, among the 200 realizations. Since the TBATS algorithm is not designed 
for analyzing the processes Yi and Y2, it fails consistently. Please see Figure |4] for an 
example. 

The RRASE, and the standard deviation, of the results by SST for Yi and Y2 are 
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Fig. 4. Performance of TBATS on Yi andY-z. Left (resp. right): The TBATS results of Yi (resp. 
V2). First row: the realizations of Yr and Yz- Second (resp. third) row: the S2,i(t) and S2,i(t) 
(resp. S2,2(t) and S2,2(t)) are respectively represented by the red curve and the black curve. Bottom 
row: the T{t) and T{t) are respectively represented by the red curve and the black curve. Note 
that TBATS tends to fit oscillations even if the oscillations do not exist. 

Table 2. Y^ and Y2'- Two dynamic seasonal components and trend, contaminated by 



different kinds of non-stationary noise. 



Results for Y i 




S2,l 


S2,2 


T 




Time 


SST 


0.255 ±0.066 


0.236 ±0.04 


0.036 ±0.011 


0.184 ±0.023 


8.17 ±0.55 


Results for Y2 




S2,l 


S2,2 


T 


-X^GARCH 


Time 


SST 


0.225 ± 0.046 


0.21 ±0.031 


0.031 ±0.008 


0.184 ±0.018 


8.82 ±0.33 



shown in Table [2) The average computation time (in seconds) and the standard deviation 
are reported as well. Among all the 200 realizations, we demonstrate the results for one 
which gave the median RRASE in Figure [5] and Figure [6j 

4.5. Summary 

Notice that ahhough SST does not outperform TBATS when tested on Y^, TBATS col- 
lapses while SST can provide reasonable results in analyzing Yi and Y2- In the case 
of it should be noted that although TBATS can accurately estimate the periods of 
both of the seasonal components, the oscillation pattern tends to deviate from the cosine 
function, which is the ground truth in the simulation. On the other hand, due to noise, 
as is expected from the results in Theorem |3.2[ the performance of SST in estimating the 
amplitude modulation is not as good as it is in estimating the instantaneous frequency; 
however the oscillation is not distorted. We should also point out that the seasonal compo- 
nents in the TBATS model have constant frequencies, which drives it to outperform SST in 
analyzing Yq. On the other hand, from Theorem |2.1| and Theorem |2.2[ the representation 
of a function composed of harmonic functions is not unique, and SST can only determine 
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Fig. 5. Performance of SST on Yi. Top left: The realization of Yi which yielded the median 
RRASE. Bottom left: SST of the realization of l^i. Visually we can see the two seasonal com- 
ponents, and how their instantaneous frequency functions change. And we can see the intensity 
of the second component changes at about t = 7.5. Right column: From top to bottom the black 
curves are S2,i, S2,2, f and with the respective s2a, s2,2, T and X„^ superimposed as red 
curves. 



the seasonal components up to some degree of accuracy. Nevertheless, Table [T] conveys a 
clear message that the SST still enjoys reasonable performance relative to the TBATS in 
this case. Also, we can see from Table [2] that the performances of SST on Yi and Y2 do 
not differ much. This result can be explained by Theorem |3.2[ On the other hand, when 
the seasonality is modeled by the A'^^^'^^ class and the noise is non-stationary, TBATS fails 
constantly since these kinds of signals and noise violate the parametric model assumptions 
of TBATS. This explains the deteriorated performance of TBATS when tested on Yi and 
1^2 • Therefore, when we have a priori knowledge that the seasonality is not dynamic, 
TBATS should be chosen; however, in general we would suggest to try SST, at least to 
test if the seasonality is dynamical or not. 



5. Real Examples: Seasonal Dynamics of Varicella and Herpes Zoster 



The varicella-zoster virus (VZV) causes two distinct diseases, varicella (chickenpox) and 
herpes zoster (HZ) i.e. shingles. Varicella occurs primarily in children and adolescents 
and features a seasonal pattern with the peak incidence happening in the winter, see [Chan] 
et al. (2011). In contrast, HZ occurs mostly in adults and elders who had varicella in 
their childhood so that VZV resides in their sensory ganglia. When the subject's immune 
function declines, the reactivation of the residential VZV may lead to HZ. The existence 
of seasonality in the incidence of HZ has been less studied and contradictory conclusions 



were reported in Gallerani and Manfredini (2000) and Perez-Farinos et al. (2007). 

Beyond the existence of seasonality, the dynamics of the seasonality, for example, 
the relationship between the strength of seasonality and incidence rate of the varicella 
disease has been less quantified in the literature. In particular, the effect of the public 
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Fig. 6. Performance of SST on Y2. Top left: The realization of Y2 which yielded the median 
RRASE. Bottom left: SST of the realization of Y2- Visually we can see the two seasonal com- 
ponents, how their instantaneous frequency functions change and we can see the second_com- 
ponent disappears at about t = 7.5. Right column, from top to bottom are 32,1, S2,2, T and 
^GARCH - ^GARCH (black curvos), with the S2,i, S2,2, T and Xqarch superimposed as red curves. 



vaccination program on the seasonal dynamics has been less studied. As varicella is a 
highly contagious disease but can be effectively prevented by 70%-80% using varicella 
vaccines, see for example Marin et al. (2008), the free varicella vaccination was available 
to certain areas in Taiwan starting from 2003. A nationwide vaccination program was 
then launched in Taiwan in 2004. The program encouraged children aged 12-18 months 
to receive free vaccine against varicella and the vaccination rate was as high as 85% in 



2004 and then reached a plateau at 95% afterward. It has been noted in Chao et al. 



|(2012[ ) that the public vaccination program was a considerable success and the incidence 
rate of varicella dropped sharply by 70-80% after 2004. For HZ, a vaccine program was 
promoted since 2008, and its effect on the public health is not yet clear. To investigate 
how the seasonal pattern of varicella and HZ influenced by the vaccination, we carry out 
the following data analysis. 

All out-patient visit records of a representative cohort derived from the Taiwan's 
National Health Insurance Research Database (NHIRD) were analyzed. The Taiwan's 
NHIRD consists of dc-identified and encrypted medical claims made by its 23 million 
inhabitants and is publicly available to medical researchers in Taiwan. This nationwide 
database provides accurate estimates of disease incidences because of its high coverage 
rate, which had reached 99% of Taiwan's population by 2009. Moreover, the Bureau of 
National Health Insurance (BNHI) of Taiwan performs regular cross-check and validation 
of the medical charts and claims to ensure the reliability of diagnosis coding, thus enabling 
NHIRD as a reliable resource for public health studies, see Chen et al. (2011 ). For research 
purposes, 1 million (i.e. 4.3%) representative inhabitants were randomly selected from Tai- 
wan's 23 million inhabitants by the National Health Research Institute (NHRI) of Taiwan 
(http://nhird.nhri.org.tw/en/index.htm) so that the age- and gender- structures are 
identical to the whole population, and their full medical records were collected. 
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The weekly cumulative incidence rates of varicella and HZ were calculated using out- 
patient visit records of the 1 million representative cohort dataset of NHIRD. Since varicella 
often occurs during pre-school and school age, all subjects aged below 10 between January 
1, 2000 and December 31, 2009 were enrolled. Every patient in this group with the first- 
time diagnosis of varicella, under the International Classification of Disease, 9th Version 
(ICD-9) code 052.x, was identified and included in the calculation of the incidence of 
varicella. The date of the incidence of varicella was designated as each patient's incidence 
date of varicella. Those with prior varicella (any varicella cases diagnosed before January 
1, 2000) were excluded to ensure the validity of incident date. We denote the induced 
weekly varicella incidence time series multiplied by 1000 a.s Yyln], n = 1, . . . , 540. 

While varicella occurs mostly during childhood, HZ mainly occurs in patients with 
elder age. Subjects aged over 65 between January 1, 2000 and December 31, 2009 were 
enrolled and traced for their first-time diagnosis of HZ (ICD-9 code 053.x). The dates of 
incident of HZ were collected to calculate the incidence of HZ. Those with prior HZ (any 
varicella diagnosed before the January 1, 2000) were also excluded to ensure the validity 
of incident date. We denote the induced weekly HZ incidence time series multiplied by 
10000 by FhzN, n = 1, . . . , 540. 

We view each of the incidence time series, Yyln] and 1hz["-], as the discretization of a 
single-component periodic function in the -4^^^"^^ function class contaminated by an ARMA 
process. The incidence time series YvM and 1hz[?^] are analyzed by SST and TBATS and 
the results are shown in Figure [Tj Figure [8] and J^'igure [9)_The reconstructed seasonality 
and trend are denoted as sy[n\ (or shz["]) and Tv[7i] (or Tiiz[n\) respectively. 
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Fig. 7. Varicella data analyzed using SST. Left top: The varicella incidence time series Yy (black) 
is superimposed with sv -I- 7v (green). Left bottom: The SST result for IV, where the y-axis is the 
frequency and the intensity of the graph is the absolute value of the SST of Yv |2T|. Right column: 
From top to bottom are the estimated trend Tv (red) superimposed with Yv (black), the estimated 
seasonality Sv, and the residual term YV - s v - 7v. It is clear that the seasonality pattern between 
2006 and 2008 is different from that in the other years. In both columns, the a;-axes is indexed by 
year ranging from 2000 to 2010. 

We summarize the findings from Figure [7] here. First, notice that the seasonality, 
the dominant curve on the time-frequency plane, is graphically visible based on the SST 
analysis, as is expected from the properties (P1)-(P4). Second, with the estimate sv['^]) we 
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Fig. 8. Herpes Zoster data analyzed using SST. Left top: The Herpes Zoster incidence time series 
Vhz (black) superimposed witli shz + Thz (green). Left bottom: The SST result for Y»z, where the 
y-axis is the frequency and the intensity of the graph is the absolute value of the SST of Fhz (21) . 
Right: From top to bottom are Yhz (black) superimposed with the estimated trend Thz (red), the 
estimated seasonality shz, and the residual term Yhz - shz - Thz- 



can tell the dynamics of the seasonality. Before the nationwide public vaccination program 
in 2004, the seasonal behavior of varicella was stable and evident: it climbed gradually 
after September or October, reached the peak level in December and the next January, 
and then declined down to the base during June and July. The finding is compatible 
with that of previous studies in Hong Kong and Denmark without public vaccination 



program, given in Chan et al. (2011) and Metcalf et al. (2009). After the launch of 



public vaccination program in 2004, accordant with the increase of the vaccination rate, 
the winter peak shifted slightly toward spring between 2004 and 2008. This finding is 
consistent with the result of vaccination program in the United States, reported in [Seward] 
et al. (2002[ ). More importantly, less oscillatory seasonality is observed after the launch 



of public immunization in 2004. This finding is important and less reported before. The 



estimated trend of varicella incidence of is compatible with the finding in Chang et al. 



pOTI| ), and we refer the readers to the paper for more discussion. The obvious drop in the 
trend starting from 2003 may be explained by the free varicella vaccination program in 
2003, which was subsequently accelerated by the nationwide public vaccination program 
commenced in 2004. Both the sharp decline during 2000-2001 and the increase during 
2002 in the trend are less conclusive by this analysis; instead they may have been simply 
artifacts caused by the transition of the coding system. In Taiwan, the whole medical 
claim system had undergone a transition from a localized coding scheme (A-code) to the 
international standardized coding scheme (International Classification of Disease, ICD-9), 
which was not completed until 2002. The gradual decrease in the trend starting from 
2005 and the fact that the trend seems to level off starting from 2008 may be interpreted 
as the expected impact of the vaccination program. Clearly, the SST analysis showed 
its robustness to the coding bias problem in 2000-2002, when recovering the trend in 
2003-2010. 

Although it is not easy to tell directly from the HZ incidence time series, visually 
we can detect the existence of the seasonality from the time frequency representation of 
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YhzM provided by SST, as can be seen in Figure [s] The existence of the seasonaUty 
is supported by the fact that the occurrence of HZ in elders is a response to the T cell- 
mediated immunity which is weakened during winter, see for example Altizer et al. (2006| . 
From the reconstructed seasonality shz['t-] and the reconstructed trend, we see that while 
there was a 50% increase in the HZ incidence rate among elders after the implementation 
of nationwide varicella vaccination program in 2004, its seasonality is dynamical - the 
frequency was slightly higher before 2002 and slightly lower around 2007. (Notice that the 
boundary effect should contribute little to this finding since we symmetrically reflect the 
data on both sides separately.) The finding is rarely reported before but supported by the 
host immunity (human immunity) mechanism, see Dowell (2001). Our finding about the 
HZ incidence time series also coincides with the finding reported in Chao et al. (2012| , 
that is, the incidence of the herpes zoster in elders increased after the implementation 
of the free varicella vaccination programme. In particular, overall the trend was stable 
during 2003-2008. Notice that around 2008 there was a significant drop in both IrzH 
and the estimated trend, which might have come from a systematic change. However, we 
do not have enough evidence to explain this significant drop, and the underling mechanism 
deserves further investigation. Further study of the herd immunity, the interaction between 
varicella and herpes zoster and the host immunity mechanism is needed but out of scope 
of this paper. 

To compare the findings based on the analyses of varicella and HZ incidences using 
SST with that using TBATS, we now look at the results obtained from TBATS, which are 
summarized in Figure [9j For the varicella time series Yy, we can see that the estimated 
seasonality plus trend deterministic component, i.e. sy -|- TV, follows Yy well, while the 
estimated trend itself (Ty in red) does not. In particular, after 2005, the trend seems 
to oscillate in the opposite direction with respect to Yy. This can be explained by the 
fact that the unexplained part of the deterministic signal, that is, the dynamics in the 
seasonality, is absorbed by the trend estimate so as that TBATS can fit a nice seasonality 
to the time series Yy. Indeed, the optimal seasonal period chosen by TBATS was exactly 
52, that is, a year, which does not agree with the expectation that the public vaccination 
program should cause changes in the seasonal pattern. In addition, from the public health 
viewpoint, it is somehow difficult to interpret the second peak in March in the seasonal 
component. For the herpes zoster time series Ihz, we see that while both the trend plus 
seasonality estimate (shz + Thz in green) and the trend estimate (Thz in red) follow Ihz 
closely, the trend estimate appears to oscillate to same extent. Again, this inevitable for 
TBATS because any unexplained dynamic oscillation is absorbed by the trend estimate. 
For the estimated seasonality, TBATS shows that its period is 51.2, that is, less than a 
year. To sum up, we find the results given by TBATS are harder to interpret when the 
seasonal periods are dynamical. 



6. Discussion 

We introduce a new nonparametric model and an adaptive time frequency analysis tech- 
nique, referred to as Synchrosqueezing transform (SST), to analyze time series with dy- 
namical seasonal behavior and smooth trend, which is contaminated by stationary time 
series modulated by slowly varying variability. Apart from the identifiability results for the 
nonparametric seasonality model, theoretical results are provided to justify and quantify 
the capability of SST to extract the seasonal component and the trend from the noisy ob- 
servations. In our numerical study, besides testing SST on simulated data and comparing 
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Fig. 9. Varicella and Herpes Zoster data analyzed by TBATS. Left: From top to bottom are Vv 
(black) and sv + Tv (green); "VV (black) and Ty (red); sv (black); and Yv - s v - 7v (black). Right: 
From top to bottom are Vhz (black) and shz + Thz (green); Yhz (black) and Thz (red); shz (black); and 
ihz - SHZ - Thz (black). In both columns, the x-axes is indexed by year ranging from 2000 to 2010. 



it with the recently developed method TBATS, we applied SST to evaluate the influence 
of the general application of the vaccine on the seasonal dynamics of two highly related 
diseases, varicella and herpes zoster. 

We list below a set of open problems here to complete the discussion. 

• Although the influence of the stationary random process on SST is carefully ana- 
lyzed, more studies on the statistical side are needed. For example, how to setup 
a hypothesis test to identify the existence of the seasonality component? What is 
the best threshold in SST, and in which sense? Forecasting is another important 
issue in the time series analysis. Some modifications of the current SST algorithm, 
such as boundary modification, are required in order to achieve this aim. We will 
investigation these problems and report our findings in the near future. 



It is sometimes interesting to understand the structure of the error process. For 
example, this is important when we want to forecast future values. There exist 



several algorithms aiming at identifying CARMA (or ARMA) processes (Brockwell 



and Davis (2002 ); [Brockwell (2001 )). In addition, when the noise is modeled as a 
CARMA or ARMA random process modulated by a slowly varying function (j{t), 
we may apply to the residuals the algorithms discussed in the literature, for example 



Dahlhaus (lOQ?] ); [Hallin (19781 11980|); [ Rosen et al. (200"9| . Although the error pro- 



cess can be efficiently separated from the observed time series by SST (indicated by 
Theorem 3.1 result (iii), Theorem 3.2 result (iii), and Figures [sl [5| and [6|) , the theo- 



retical quantification of the infiuence of SST on such an approach remains unknown, 
and further investigation is needed. 

Since the proposed model for the seasonality is nonparametric in nature, the results 
from the SST algorithm may be used to check the validity of some specific sub-models 
such as constant seasonal periods. Further investigation in this direction is needed. 
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• From the clinical viewpoint, although a preliminary interpretation and survey of the 
inspected seasonality is provided, further clinical studies are needed to understand 
the interplay between the seasonality and vaccination. For example, the herd im- 
munity, the interaction between varicella and herpes zoster and the host immunity 
mechanism. 
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S . 1 . P roof of Theo rem l2?f] 

The proof contains two parts. The first part is showing the restrictions on the perturbations 
a and /3 based on the positivity condition of the instantaneous frequency and amphtude 
modulation functions. The second part is to control the amplitude of a and /3 and their 
derivatives by the "slowly varying" conditions oi A'^^''^^ functional class. 

First, observe that if (14 1 holds, then by the definition of A'^^''^^, we know 



a e C^{R), 13 e Ci(M), 

inf (j)'{t) > ci, sup0'(i) < C2, 

inf a(t) > Ci, supa(i) < C2, 

tsK tea 

\a'{t)\<e4>'{t), \r{t)\<t<t>'{t), 



and 



inf + a'{t)] > ci, snp[(t>'{t) + a'it)] < C2, 

M[a{t) + l3{t)] > ci, sup[a(t) + /3(i)] < C2, 
*6R teR 

Wit) + (3'{t)\ < e(0'(i) + a'it)), + a"it)\ < e(0'(i) + a'it)). 

From (S.7) we have 

1/3' < e(20'(t) +a'(t)) and < e(20'(t) + a'(t)). 



(S.l) 
(S.2) 

(S.3) 

(S.4) 

(S.5) 
(S.6) 
(S.7) 

(S.8) 



Now, based on the conditions (S.l), (S.2|, (S.3), iS.5» and (^S.6|, we prove that a and /3 



must have some "hinging points" that restrict their behaviors. Notice that the definition of 



tm and Sm depends on the monotonicity of which is true by condition (S.3). Observe 
that, for any n € Z, when t = t„, 



(a(i„) + /3(t„)) cos(0(t„) + a(i„)) 
(a(i„) + /3(t„)) cos[n7r + 7r/2 + ait„)] 
a(t„) cos(n7r + 7r/2) — 0, 



(S.9) 



where the second equality comes from (14 1. This leads to ck(i„) — k^n, fc„ G Z, since 
4'itn) = in + l/2)7r and a(i„) + /3(tn) > by ( |S.6| ). We show that fc„ are the same for all 
71 e Z. First, suppose there exists t„ so that Q!(i„) = fevr and aitn+i) — ik + 21)'k, where 
k,l £ Z and I > 0. Note that k should be an even number, otherwise we have 

a(t„) cos(0(t„)) = (a(t„) + /3(i„)) cos((/)(t„) + a(t„)) = -(a(<„) + ^(t„)) cos(0(i„)). 



which is absurd due to ( |S.2[ ) and ( |S.5 

t' e itn,tn+i) so that a(i' 



Then, it follows from (S.l) that there exists 
ik + l)7r and hence 



ait') cos(0(t')) = (a(i') + /3(t')) cos(</)(i') + a(t')) = -(a(t') + /3(t')) cos((A(i')), 
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which is absurd since cos{(j){t')) ^ 0, a{t') > and a{t') + l3(t') > by and ( [S^ |. 

Similar argument holds when / < 0. Second, suppose there exists i„ so that = kir 

and a(tn+i) = (fc + 2/ — l)7r, where k,l £ Z and Z > 0. If / > 1, the same argument holds. 
If Z = 1, it is absurd again since a{t') > and a{t') + (3{t') > but the sign changes. 

Thus we know a(t„) = kir for all n G Z, for some fixed k € 1. We now show that 
k — Q. Suppose a{tn) — koTT, where fco is a fixed even integer, for all n G Z, then we 
set a to be a — koir and the claim is done since they are equivalent after being composed 
with the cosine function. Suppose fco is a fixed odd integer, then since a G C^(M) and 
a{tn) = a{tn+i) — fcoTT, there exists t' £ (i„,t„+i) so that a{t') = k^n and hence 

a(i') cos(0(t')) = (a(i') + cos(0(t') + a{t')) = -(a(t') + /3(i')) cos(0(t')), 



which is again absurd since cos{(j){t')) ^ 0, a{t') > and a(i') + /3(t') > by (S.2) and 
(S.5|. As a result, we get a(t„) = for all n £ Z. Furthermore, by the fundamental 



theorem of calculus, we know 



= a{tn+i) 



a'{u)du, 



(S.IO) 



which implies that a'{t) changes sign inside [t„, t„+i] for all n £ Z. Also, due to the 
monotonicity of (f) + a (S.6), 



\a{t') - a{t")\ <7T 



(S.ll) 



for any t',t" £ [tm,t,n+i] for all to G Z. Indeed, if \a{t') — ce{t")\ > it, we contradict (S.6) 
since {n + 1/2)tt < cj){t') + a{t') < {n + 3/2)tt and + a{tm+i) = (n + 3/2)7r. 

We next claim that f3{sm) > for all m £ Z. When t = Sm, we have 



(-l)"a(s,„) = a(s,„) cos(TO7r) 

= (a(s,„) + /3(s„)) cos[m7r + a{sm)] 
= (-l)"(a(s„0 + /3(s™)) cos(a(s„0). 



(S.12) 



where the second equality comes from ( 14), which leads to (3{sm) > since | cos(Q;(sm))| < 



1. Notice that (S.12) implies that a{sm) = 2fcm7r, where km S Z, if and only if /3(sm) = 0. 
We now claim that for all to G Z, if f3{sm) — 0, then fc,„ = 0. Without loss of generality, 
assume k„i > 0. Since a £ C^(M), there exists t' £ (tm-i, Sm) so that a{t') = tt and hence 

a{t') cos(0(t')) = (aif) + P{t')) cos(0(<') + a{t')) = -{a{t') + p{t')) cos(0(t')). 



which is absurd since cos(0(t')) ^ 0, a{t') > and a(t') + I3{t') > by (S.2) and (S.5| 



Next we claim that for all m G Z, i f j3{s m) > 0, then |a(s„j)| < n/2. Indeed, since 
< cos(Q:(sm)) = aisl^+i^ism) < 1 by (S.12), we know a{s,n) e (-7r/2, 7r/2) + 2n„7r, where 
rim G Z. By the same argument as in the above, if Um > 0, there exists f £ (tm-i, s„j) so 
that a(t') = TT and hence 

a(i')cos(0(t')) = (a(t') +/3(i'))cos(0(t') 



-(a(<') + /3(i'))cos(0(i')), 



which is absurd since cos{(f>{t')) ^ 0, a{t') > and a(t') + > by and ([S|5]). We 
have thus complete the first part of the proof. 
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Before going to the second part of the proof, notice that by (S.12), the behavior of 
can be further bounded by Taylor's expansion. Indeed, for each m G Z, since 



a(s„) 

|ci(sm)| < 7i'/2, we have 



1 



^ 1 



His,, 



<i(s„) 



(S.13) 



which conies from Taylor's expansion of the cosine function around 0. 

Also notice that by the mean value theorem, for all fc G Z, we have for some t'^ G 

[^m 1 ^m+l] 



m+1 



m+1 



= 4> (tm)^ 



which means that 



(tm+l — trn)4''{t'm) — ■ 

Similarly, we have for some G [s,„, Sm+i] 

(Sm+l - Sm)0'(sj„) = TT. 



(S.14) 



(S.15) 



To finish the proof, we have to consider the conditions (S.4| and (S.7) and show that 
\a'{t)\ < 37re, \a{t)\ < |^ and |/3(i)| < Aire for all t G M. Suppose there existed t' G 
{tm,tm+i) for some m G Z so that > ^Tre. Without loss of generality, we assume 

a'{t') > 0. By the fundamental theorem of calculus and (S.S), for any t G (t„i,t'), we 
know 



\a'it')-a'{t)\ < \a"{u)\du<e {2q^' (u) + a' {u))du 



= €[2(j){t') - 2(j){t) + a{t') ~ a{t)] < 3ne, 



where the last inequality holds due to (S.ll) and the fact that — (/){t) < (/){tm+i) — 
0(^m) = TT. Thus, a'{t) > for all t G [im,im+i], which contradicts the fact that a'{t) 
must change sign inside [trn,,tm,+i] as shown in (S.lOl. 

Then we show that |a(t)| < for all t. Recall that a{tm) = for all m G Z. If there 
exists f G {tm,tm+i) for some m G Z so that a{t') > Jt^, by the mean value theorem 
and ( |S.14[ ) there exists t",t"' G [tm,t'] so that 

a'{t": 



a{t')-a{t^)_ a{t') ^ d^'ji'") ^ ^^^^ 



f — f f — f 

where the last inequality holds due to 

t' 



<j)'{t') TT 



|</)'(t') - </)'(r)| < / |</)"(w)|du < e / ^'iu)du = e(^(t') - < 



(S.16) 



Since ci > 4e, we know 47r[l — ^T^py] > Stt, which is a contradiction. 

Suppose there exists t' so that f3{t') > Aire. Take m G Z so that t' G (smjSm+i)- 
Without loss of generality, we assume t' < tm- Take t G {sm,t') and we have by the 
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fundamental theorem of calculus 

t' 



\m-m\ < \l3\u)\du < e[2c^it') ~ 2cj,{t) + a{t') - a{t)] 
< e[2(/.(s„+i) - 20(.s™) + a{t') - a{t)] < 3^e, 



which leads to, in particular, f3{tm) > 7re. By (S.13), we know 



2Pitm) , , 

y a{tm) + P{tm) 

which contradicts to the fact that a{tm) — 0. The proof is hence completed. 



S.2. Proof of Theorem 



Here, we prove Theorem |2.2[ the multiple-component version of the identifiability theory. 
We need the following lemma which is the rewritten form of Theorem 3.3 in |Daubechies| 
et al. (2010). Notice that, contrary to our setup, in Daubechies et al. (2010) the authors 
considered a broader class of functions Ae,d- 

Lemma S.2.1. Take f(t) = J^iLi ^li^) (^os[2T:4>i{t)] e A^^^f^ and the mother wavelet 
the same as that in Theorem 



3.1 



Then for a G f , we have 



K 



Wf{a,b) - ^ A,(6)e'2.0,(6)yaV^ (005(5)) 



<Ewe, 



K 



idbWf{a,b) - 2^^0;(6)^i(6)e^2.„0,(b)y^^(a0j(5)) 



1=1 



where Ew and are universal constants depending on the moments of ijj and ip' , Ci, C2 
and d. When a € [ , 4>^.^) ] l^/C'^i^)! 0; have 



|c./(a,6)-</.l.(6)| < 



E,., 



2^\Wf{a,b)\ 



where := E^y + 2Trc2Ew- Furthermore, when F > 



l + A 



Remark: Note that there are two error terms in the reconstruction formula. The first 
one Ew ^-^^ e A originates from the definition of A'^^f^, that is, the model bias, and the 

second term comes from the thresholding parameter F. Since ip is smooth and compactly 
supported, the second term might be improved but the improvement is limited. We thus 
choose the current simple bound to make the proof clear and clean. 
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Proof. Denote Zk{b) k ^ 1,...,K. By (Daubechies et al., 2010 

Estimate 3.5), we know that 



Wf{a,b) = \ Mb)e^'-^''^'>^V^iPia<t>',{b)) + Cia,b)e when a € (6) 
' I C{a,b)e otherwise, 

where C{a, 6) e C and 

K 

\C{a, b)\ < c?l^ {Uh)h + + ^Au{b) (ah\4>'m + ^"'^a) }, 



fc=i 



In other words, we have 



VF/(a, b)^Y^ Akib)e'^^^^^^'^y^ip{a(j)'^{b)) + eC{a, b). 



Similarly, we have 



k=l 



K 



1=1 



where 



K 

\C\aM < a^/^^ {0^6)/; + '-^ai;, + 7r A,{b) [al!,\^',{b)\ + '-^a'l',) }. 



fc=i 



By the assumption of A'^^f^, we know that each function in A'^^f^ can have at most 
r in(i+rf)l'n(i-d) 1 components. Thus, when ^ <a< C{a,b) is bounded by 



C2 I- In C2 - In ci 



£.3/2 I ln(l + d)- ln(l - d) 
and C'{a, b) is bounded by 

In C2 — In ci 



C2 r lnc2 -Inci ^ ( ^, ^ I'^ , ^/C2 , C2 



The estimation of w/(a, 6) is the same as that in Theorem 3.3 in Daubechies et al. (2010 1, 
so we skip it. By the above expansion for Wf{a, b) and a direct calculation, we have 



Wf (a, b)a-^/\\Wf ia.b)\>Tda-Ak {b)^^^^- | 



(S.18) 



< 



Zk{b) 



Zk{b) 
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where the first term disappears since /^^.j^) V'^'^i'^^'ki^))'^ ^/^da — TZ^ and the second 



term is bounded by — ^ eA. We simply bound the third term by 



/ \Wfia, 6)|a-'/'x|w,(a,6)|<rda < T / a-^'/^da < 2y^TA 

JZk(b) JZk(b) 



As a result, (S.I8I is bounded by {Ewe + F) ^^^ A, as is claimed 



Next we consider the discretized case. 



Lemma S.2.2. Take f ^ {/(nr)}„ez, where f{t) = Xi^i ^z(i) cos[27r0i(t)] e X'^f^ 
and < T < Yqr^;^ is the sampling interval. Suppose further that Ai G C^(K) and 
supjgjj < for all I — I,. .. ,K. Take the mother wavelet the same as that in 

Theorem 



3.1 



Then for a S \^—^, y;g kave 



K 



1=1 

K 

i5fcT^/(a,6)-27r^0;(6)yl;(6)e*2^"^'WVaV^(a0;(^)) < {Er.w'^^ + Ew')e 



1=1 



where E^ iy and E^ -^y' ol^^ universal constants depending on the first three moments of ip 
and V'', Ci, C2 and d. When a € [ , l^^) ] '^^'^ l^fi'^^ ^)l 7^ Oj have 



\u;f{a,b)~Mb)\ < 



Eta 



2n\Wfia,b)\ ' 

where Er.u '■— E^^ + [Er.w + 27rc2-Er.w]''"^- Furthermore, when F > 0, 



7^: 



l + A 



Wf{a,b)a-^^\^w,(a,b)\>r<ia-Ak{b)^^^*>'^''^ < {[t^Et,w + Ew]e + T)^A 



Remark: The remark after Lemma |S.2.1| holds here. Moreover, notice that when the 
sampling interval r is small enough, the result is essentially the same as that in Lemma 



S.2.1 except the error introduced by the discretization. 



Proof. By the Poisson's formula, since -0 S 5, for a > and 6 G M we have 



nez >- V J 



where means the Fourier transform. Note that Wf{a,b) = '|/(^)^'0(^^^) ^ (0). 

Thus, the difference between Wf{a,b) and Wf{a,b) is X^nsz n^o ^/"^^''('^' where we 
denote 
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We now approximate J2n&z n^o^f^^^\'^^^) ^^^P step. First, we evaluate the 
following integration for n e Z\{0}, k G {1,...,K} and {a,b) g ]R+ x JR. Note that 
Ak{b) cos (2TT{(j)k{b)—b(j)'i.{b))+2TT(j)'i^{b)t'^ can be viewed as an approximation of Ak{t) cos [2n(f> 
at b. 



Ak{b) cos (27:{Mb) ~ b<j)[{b)) + 2^0',(6)t) 
:Ak{b) I cos(27r(0fc(6) - 



a 



^dt 



da; 



:A.(6) / cos(27r(0fc(6)-u))^V, . 

/a \ aq)). (o) 



„(u + b*' (b)) , 



-du 



i2TT , " , 



-du 



(S.20) 



Since r < 



l-A 1 

1 + A C2 



and 



l-A 1 



2A C2 1+A C2 ■ 



l-A 1 



, (S.20 1 becomes 0. 



Before proceeding with the proof, we prepare some bounds which measure the error 
when we approximate cos(27r0(t)) by cos(27r((/)fc(6) — &(/)5j,(6))+27r(/)'j.(6)t)) and so on. Clearly 
when \5\ > 0, we have | cos(a; + S) — cos(a;)| — | J^^'' sin(it)du| < \S\, which leads to the 
following bound when j = 0, 1, 2: 



< 



|(cos(2^0(t)) - cos(27r(0fe(6) - b(l>'^{b)) + 27r0'fe(6)t)))V'S(t)|dt 



cos [27r (Mb) -b(f>'k{b) + 4>',{b)t+ / [4>',{b + u) ~ ^'.(b)] du)] 



t-b 



dt 



W,ib + u)-<P',ib)]du ^Pi'lit)\dt 



1 



\t-bmib)\ + '-^\t-b\')\i,i%t)\dt 



(S.21) 



where in the last inequality we use the fact that e < 1 to simply the bound. Similarly, 
by the Taylor's expansion, e < 1 and the assumptions of A'^'-f'^, we have the following 
bounds: 



/ \Ak{b) ~ Ak{t)\\i^a.b{t)\dt (S.22) 

JR 

<^Jjt-b\ {\^Ub)\ + l^C2\t - b\) |Vv^|dt < e(c2a^/^li"'> + ^ec^a^/^/f ) 



8 Chen et al. 

and 



/ \4>'k{h) ~ 4>'k{t)\\^am\'^t (S.23) 
<£ / \t-h\(\cj,'^{h)\ + ]-tC2\t-b\)\J~^)\dt< e(c2a^/2/W + lec2a5/2/(0)^. 



With the above bounds, we now evaluate the difference between Ak{h) cos{2n4)}^{t)) 
and Ak{b) cos (27r((/)fc(6) — 6(/)'j,(6)) + 27r0'^(5)i). Since V'a.b G 5, by the integration by part 
and the regularity assumption for (t>k{t), we have 



- f Akib) cos (2^(0fc(6) - b<j)'^{b)) + 2n<j)'^{b)t)'^KAt)e^dt 

= Ak{b)-^ I f -27r0;:(t)sin(27r0(i))^Z^ 

- 47r2(0',(t)' cos(27r(/.(i)) - ^(fe)' cos(2^(0fc(6) - 60^^)) + 2^0^^)^)))^^ 

- 47r(0UOsin(27r0(i)) - ^'Jfe) sin(2^(0fc(6) - bcp'^{b)) + 2^0U&)t)))^i'6 W 
+ (cos(2^0(t)) - cos(27r(0fe(6) - bcj^'^ih)) + 27r(/.'fe(&)0))^S Wje^'^'^dt. 



The first term in the last integral is simply bounded by e2'KC2l'i^\ By (S.21) with 



j — 2, the fourth term in the integral is bounded by [^a^/^/j + Rewrite 
cos(27r</.(t)) - cjy'^ihf cos(2^(0fe(6) - 6<^',(6)) + 2^0',(6)t)) = - d^'^{h)m{t) + 

0;(6))cos(2^0(t)) + 

( cos(27r0ft)) - cos(27r(0fc(6) - + 27r0'j^(6)t))) . By ( |S.2l[ ) with j = 1 and 

(S.23), we bound the second term in the integral by e87r^C2 {^2Ci^^^Ii^ + \€C2o:'^'^l'^'^^ + 
e47r2c2[f a^/2/f) + f a^/2j^2)^. sin(27r0(t) (6) sin(27r(<?!)fe(6)-60'j^(6))-| 

27r</.U6)t)) = - sin(27r</.(i)) + 0U^')[sin(27r</.{t)) - sin(2^(0,(6) - 60^(6)) + 

27r(/)'j,(6)t))], (S.21) with j = and (S.23), we bound the third term in the integral by 
e47rc2 (c2a5/2/j°) + iecaa^/^jCo)^ ^. ^4^^^ [f 03/24^) + f a^/^/^i)] . As a result, we bound 

the entire integral by ei?T.iv.</), where < Er^w,<i> < 00 is an universal constant depending 
only on the first three moments of ijj and -0', ci, C2 and d since a € [ , ''^^j • 



Then we approximate the difference between Ak {t) and A^ (b) , again by the integration 
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by part: 



(Akit) - Afe(6))cos(27r(/)fc(t))^a,6(t)e'2"'?dt 



T 



{ A'l it) COs(27r0fc it))iJa,b (t) - 47tA[ {t)^'k it) sin 2TT^k it)iJa,b it) 



+ 2A'fe it) cos 2^0fe (i)V'ili (<) - 27r(Afe (i) - Ak (b))^^ (i) sin 2n(j)k (t)^a,h(i) 



4^20'^ (i)2 (^^, (i) _ (5)) COS 27r0fe it)^a,hit) 



MAkit) - Akib))4>'kit) sm27rMt)i^^alit) 
• iAkit) - Akib))cosi2TTMtWalit)}e-'^^^'it- 



(S.24) 



By applying (S.22) and the same arguments as before, the last integral is bounded by 



^-3/2j^2)j (^f.^g^3/2j(o) _|_ £C2 j^o)^^^ which is bounded by eEr,w,A^ where < Er,w,A < oo 
depends only on the first two moments of ip and tp' , Ci,C2 and d since a S [- 
Note that the condition \A"\ < C2e is used only for the first term. 

In conclusion, we get 



A 1+A] 

C2 ' Ci 



K 



nGZ.n^O 



where Er^w ■= ^iEr.w.rp + Er^w,A) r in(i+d)~in(?-d) 1 ™ Universal constant depending 
only on the first three moments of "0 and ?/;', ci,C2 and d. Note that we use the fact 
that Yl^=i "-"^ — IT function in ^^^/^ can have at most r in(i+d)Iin('i-rf) 1 

components. Eventually, we have 



\Wfia,b)-Wfia,b)\<\Y, W^p'^^ (a, 6) 



< Er,WT^e. 



(S.25) 



By a similar argument, for which the detail is skipped, we have 



\dbWfia,b)-dbWfia,b)\ <Er,w'Th, 



(S.26) 



where E^-yv' is an universal constant depending only on the first three moments of ij: and 
V'') Ci,C2 and d. Combined with Lemma S.2.1 we get the claim. 



1 Chen et al. 



Next, when a e [^t^, ^t^] and \Wf{a, 6)| ^ 0, we have 
ujf{a,b)-<f>'^{b) 

2TrWf{a,b) 

[Afc(6)e'^''W7aV;(a^U^)) " Wfia,b)]^',{b) 
Wf{a,b) 

I dbWf {a,b)-dbWf{a,b)\ + \ dbWf {a,b)~ 2ncp'^ {b)Ak (b)e'^^ ^ij{a^'k {b))\ 



< 



27:\Wfia,b)\ 

|Afe(6)e'^'^'WVa^(a<^U&)) " Wfia,b)\ + \Wfia,b) - Wf{a,b)\U,ib) 



\Wf{a,b)\ 



^ [Ew + Er,w'r^] + 2ttc2[Ew + Er,wT^] 
- 27r\Wf{a,b)\ ' 



Er,i. 



27T\Wf{a,b)\ 



as is claimed. 

By the above expansion for Wf{a, b) and W/(a, b) and a direct calculation as that for 



(S.18), we have 



/ M/y(a,5)a-3/2;^|v^ (,,,)|>rda-Afc(5)e*2.^'=W| 



(S.27) 



< 



7^-lAfe(6)e'2.0.(6) / 7^^(a</,;(6))a-3/2da-^fe(6)e'2.0.(6) 



Zk{b) 



+ n-h [ [T^Er,w + Ew]a-''^^da + n-' [ \Wf{a,b)\a-^/^ 

JZk(b) JZk(b) 



X\Wf(a,b)\<T'do.^ 



where the first term disappears since Jz^ib) V^^^i'^4>ki^))o- ^^^da = TZ^ and the second 
term is bounded by — — k"^"""*"^""^ simply bound the third term by 

/ \Wf{a,b)\a'^^\lw,,^^,\<rda<T f a-^'^^da < 2y^rA. 

JZk{b) ' JZk(b) 

As a result, we get the claim. 



Proof (Theorem 2.2). Choose i/; to be a Schwartz function so that 

1 



■0(0 = e 6xp 



'1-^2 



- 1 



X[l-A.l+A] (0: 



(S.28) 



where x is the indicator function. It is well known that "0(0 is real, smooth, compactly 
supported, monotonically decays when ^ S [1, 1 + A], monotonically increases when ^ S 
[1 — A, 1] and has support [1 — A, 1 + A]. Note that ifj has only one maximal point at ^ = 1 
and ■0(1) — 1. Fix / e -4^^/^. Suppose / has two representations, both are in A 



Ci ,C-2 



N M 

fit) ^^ai{t)cos[2n(f>i{t)] ^^Ai{t) cos[27r$i(t)] G A"^ 



1=1 



1=1 
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Denote Zi{b) -.^[^,1+^], / = 1, . . . , iV, and Yk{b) := 
Then by Lemma S.2.1 we have 



l-A 1+A 



,(6) ' *'Jfc)J' 



] , k = 1,...M. 



K 



1=1 

M 



\Wf{a,b) - ^ A,(6)e'2.*,(b)y^^ („<i,;(5)) | < E^r^e, 



where 



1=1 



p < C2 r lnc2 - Inci 
'"-^cf''ln(l + d)-ln(l-d)'^ 



'C2 



C2 



2ci 



)}• 



and < c < oo is the maxhxium of the first three moments of the chosen ij: ( S.28 ). In other 
words, is a universal constant whose bound depends only on ci, C2 and d. Denote 

N 



L{a, b) := J2 ^i{b)e''^'*"^''^Va$ (a^K^)) , 



i?(a, 6) ^ A, (6)e'2^*' ^''^ (a$; (6)) 



Since the continuous wavelet transform of / is unique, we have 

\L{a,b) - R{a,b)\ < 2E^e. 



(S.29) 



By the profile of the chosen V', there exist N subintervals, /; C Zi, I — 1, . . . , N , around 
jp^ so that on \L{a,b)\ > ai{b)^/2 > ^^i^^fi. Clearly ^7^, 1 = 1 iV, are the 



.... ~ yt^y 

maximal points of |L(a, which are dyadically separated by Similarly there exist 

M subintervals, J; C 11, / = 1, . . . , M, around so that on J;, \R{a, b)\ > Ai{b)^/a/2 > 



/l-Aci 



Also 



1 = 1,..., M, are the maximal points of \R{a, b)\, which are dyadically 
separated by ( |ll[ ). Since is a universal constant, when e is small enough, the equality 
in (|S.29[) cannot hold if M 7^ TV or Zi{b) n y;(&) 7^ for any Z = 1, . . . , /V. Thus we have 



N 



N 



fit) = 5]ai(t)cos[27r^,(i)] - E^'W cos[27r<i>,(0] e 



1=1 



1=1 



Moreover, it is clear from (S.29) that the set n Jk{b) = for all / 7^ k. Also, on the 
set {Zi{b)\Yi{b)) U {Yi{b)\Zi{b)), we have \Wf{a,b)\ < 2E^e for ah / 1, . . . , TV. 

Next, we claim that - $i(6)| < SK^e, |<?!>'i(^) - *i(&)l < 10(1 + d)cl''^Erae and 

|ai(6) - Ai(&)| < 2^E„,e. Note that by (|S.29[), we have on Zi(&) n Yi(6) that 



ai(6)V(a(/''i(6))-Ai(6)V'(a$'i(6)) 



^ 2Em 



and 



(S.30) 
(S.31) 
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since '0 is a real function and e ^ 1. 

It is clear from ( [S3T| ) that 0i(6) = $i(6) + 27rn(6) + 5(6), where n{b) £ Z and |(5(5)| < 
iE^nt for all 6 e M. Since \5{b)\ < iEj^e, 4>i £ and $i € C^, we know n(&) must be 
a constant integer when e is small enough, and hence we can set this constant to 0. As a 
result, we know 



(S.32) 



Without loss of generality, assume ai{h) > Ai{b). When a = 1/0^(6), (S.30I becomes 



aiib)^p{l)-Ai{b)^P 



ai{b) - A,{b)4, 



<t>'i{h) 



< 2^C2E„,e. (S.33) 



Thus, if aiib) — Ai{b) > 2^/c2E„ie, we get a contradiction. Indeed, ai{b) — Ai{b) > 

ai{b) — Ai{b)Tlj (^'jr^^ — \ai(b) — Ai{b)^ (^^X^) | since 1 is the unique maximal point of 

ip. To show the bound of \(j)'i{b) - pick ao £ Zi so that ^jj{ao(l)'i{b)) - '0(ao$i(6)) = 

^{ao4>'i{b) — ao$'j(6)) by the mean value theorem. Notice that HV^'Hl^ > 1 and i\) G . 
Then by the fact that oq G [i^, i±^] , (|S.30| becomes 



Since ai(6) > and -0 > 0, by (S.33) we have 

re - 2^E.^t < (6) - $i(0)) < ; £ + 2^£;„je, 



"i^(</);(6) - $'i(&))| < 2v/c2(l + d)£;,„e + 2y^E^e < 5y^E,ne 



and hence 



< -^^^5V^£;,„e < 10(1 + d)4^^E,„e, 



(1 - A)ci 



where the last inequality holds since < 1 + d. 

Similar argument holds for all A; = 2, ... , K, and we conclude the theorem. 



S.3. Proof of Theorem l3?f] 



To prove Theorem |3.1[ we need the following technical lemma. IT is aimed at adapting 
the delta method, for approximating the mean and variance of the ratio of two random 
variables, to our setup. In particular, when we show the properties of wy, in part (ii) of 
Theorem 3.1 and part (ii) of Theorem |3.2[ we will encounter the problem discussed in this 
lemma. Notice that we only assume the existence of variance but not the higher moments. 
Note that the conditions var(C) < (9/32)'^ and |?/o| > var(C)-'^/^ are not crucial but chosen 
simply to simplify the final bound. 
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Lemma S.3.1 (Delta method). Given two complex random variables C and C so 
that E^' = = 0, var(C) < (9/32)^ and var(C') < oo. Fix two complex numbers xq and 
2/0 so that \yo\ > var(C)^/^. Define C" := Cxc\i3|„j,|/4(-yo)' B^yol/ii-Vo) is the ball of 

radius |?/o|/4 centered at —yo and x is the indicator function. Then we have the following 
relationship: 

' Xo + C \ Xq 



E 



yo 



ei, 



(S.34) 



where ei S C and 



|ei I < ^ ( - 21var(C) + 4v/var(C)var(C')) 

mr ^ yo ^ 



var 



■TO + C 



< 



252 



var(C) + 136 — Vvar(C)var(C') + 21var(C')l • (S.35) 
2/0 ' 



Proof. We start from preparing some bounds for the proof. Denote dF(;/(y), dF^(y), 
AF(<i{y) and AFqi (ii(x,y) the measures associated with C', Ci and the vector (C',C^) 
respectively. 

When Ij/oI > var(C)^^'^, since the variance of C exists and B\y^\i^{—yQ) C 'C\B-^\y^\i^{{S), 
we have 



-S|yol/4(-ao) 



di^C < 



16var(^) 

~^y^" 



(S.36) 



which is bounded by 16var(C)^^'^/9 < 1/2 since we assume var(C) < (9/32)'^. By definition 
and the assumption that EC = 0, we have 



EC' 



yAFc^a (y) 



yAF,{y) _ /B,^^,,,(_,„)2/d^^c(y) 



Since 



/ ydFAy)\ <l\yo\ [ dF, 



c(y) < 



20var(C) 
9|yo| ' 



by (S.36), we have 



|EC"I < 

9|yo| 



(S.37) 



(S.38) 



Similarly, the second moment of is bounded by 



j |ypdi^c"(2/)- 



/c 



WdF^y) 



, ^dFr 



C\B\yo\/ii-Vo) 



\v?dF^{y) < 2varC. 



(S.39) 



With the above preparation, now we come back to prove the Lemma. We have 



E 



yo + C 



Xo 



yo yo 



Vn I 



E 



1 



yo 



l| + — E 

J yo 



(S.40) 
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The first term on the right hand side of (S.40I is bounded by 



E 



1 + ^^ 

Va 



y 



{l-^)AF^n{y)-l 



yo 
1 

yl 



EC 



y^ 



yo 



1 



yo J yo + y 



(S.41) 



1 2 

where the second equahty comes from = 1 — z + when z ^ 1. We simply bound 
the integral in the second term by the Holder's inequality: 



y' 



yo + y 



dF^^iy) 



< 2 



\Si«oi/4(-!/o) y^ + y 

Thus the first term in (S.40I is bounded by 

1 



dFciy) 



< 



16var(C) 



|yol 



E 



1 + ^ 

yo 



< 



21var(C) 



By the same trick, the second term in (S.40I is bounded by: 



1 

|yo 

1 

|yo 



E 



1 + 

yo 



1 



X 



xy 



{X ^)dF(;,_^n{x,y) 

yo + y 



■dF^,^^a{x,y) 

4Vvar(C)var(C0 



|yo| J ' yo + y' mr 

where the inequality holds du e to / xdF^, ,^si{x, y) = and the Holder's inequality. Hence 
we conclude the expectation (S.34). Next we evaluate the variance (S.35). Expand 



var 



Xq 



xo + C 



= E 



yo 



E 



1 + ^ 



1 + ^ 

yo 



yo + 

- 1 I + 23? 



E 



/xo_+C_\ 
\yo + C^J 



ei — 

- yo 



+ lei 



(S.42) 



We rewrite the expectation in (S.42 1 as 

1 



E 



1 + ^^ 



yo 



E 



1 + 

yo I 



2^ 

-m 



I c 



\-E " 

|l+^|2 

vo ' ' yo ' 



(S.43) 



and bound the right-hand side term by term. Rewrite the first term in (S.43 1 as 



1 = 



-23?^ - ^ 



yo 



Vo 



= -23fi( 



\c" \ ^ 




N 1^1' 


1 yo 1 


V yo / 


\ 1 yo 1 


1 + ^ 
yo 


1 yo 1 


^ \l + i^\' 
1 yo 1 
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\i+z\: 



1 + 2 V 1 + zJ 1 



1+z 



- ^ 1+2 \l+z\- 



where we use the equahty 

-1. Thus, by ( |S.37[ ) and ( |S.38[ ), the first term in ( |S.43[ ) is bounded by 

|2 



when 



E- 



1 



|1 



yo ' 



- 1 



— 



1 + ^ 



1 



— 



— 
I 2/0 I 



1 



< 



\yo\ 



\y\dF^n 



. . , ,^^dF.s. +2 

ml J \yo + y\ 



The third term in (S.43) is simply bounded by 

\yo\' 



Next, since 



kop 7 |yo + 2/p 



' yo ' 



yo 



yo 



1 + ^ 

2/0 



16var(C') 



(S.44) 



J/o 



1 + 



yo 



by the Holder's inequality and (|S.44|), the second term in (|S.43|) is bounded by 

yo^y \,r. I ; ^ on Vvar(C)var(C') 



-E- 



^0 1 + 



Va 



2 



yo + y 



yo 



< 80- 



kollyol 



where we use the fact that E^' = 0. As a result, (S.43) is bounded by 



E 



1 



1 



CI: 



1 



^ 85var(C) ^ Vvar(C)var(CO , 16var(C') 



\yo\' 



\xo\\yo\ 



Hence with the bound of ei the variance (|S.35|) is bounded by 



yo + C 



< 127 



'var(C) 



Vvar(C)var(CO _^ jgVar(C') 



\yo\' 



\yo\' 



ei 



Since |yo| > var(C)^/'^ and var(C)^/'^ < 9/32 by assumption, we can roughly bound |eip by 



\yo[' 

and hence conclude 

XQ + C 
2/0 + 



1^ f 1251 ^ l'var(C) + 48| ^ I v/var(C)var(C') + 5var(C')) 
or ^ 2/0 2/0 ^ 



<252 



! var(C) 



136 



Vvar(C)var(CO _^ 21?'^'''^^') 



\yo\' 



\yo\' 



Proof (Theorem 3.1 ). We denote the GRP $cr := (t$ to simphfy the notation. Fix 
a mother wavelet e S. Since cr?/;^'), e 5 for / = 0, 1, the CWT of F, WV(a, 6), and 
—idbWyia, b) are understood as random variables Wf{a, b) + ^cr{'4'a,b) and — z9bM^/(a, 6) + 
^cri—iip'a b) respectively. Note that both ^a{''l'a,b) and ^ai^Pab) are in general complex- 
valued. By assumption we know that for all a > and & G K, E{^cr{ipa,b)) — E($cr(V'o {,)) = 
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0, vaT(^a-{'>Pa,b)) < CO and var(<I> cr(V-'a &)) < °o. Notice that the quantities var(<I>cr(V'o,h)), 
var($o-(^^ J,)) are independent of the time b when a is constant since $ is stationary. When 
a is not constant, var($CT('^i'a,6)) and var($CT('^i'a &)) depend on 6 and we have to handle 
this dependence. 

Step 0: Handling the heteroscedastic term (T{t). By definition 



var[$,(V„,6)] = J |a^a,6(0l'dr?(0, 



where df] is the power spectrum of Fix 6 S ]R. By the assumption of a and the 
integration by part, we have that 

£(0 := ^biO - <^ib)MO 



j {g{x) - (7(6))Va,6(cc)e-»da; < 



(1 + 1^1)'' 



where C; is the universal constant depending on the moments of tpjip' , . . . , ci, C2 and 
d. The last inequality comes from bounding the following term coming from the integration 
by part: 



fe=i 



which leads to 



J |(aV'a,6)«(x)-a(6)V'S(x)|dx 



Thus Q is 



^ ^tS'-k) -'-'^'^'^t''} =■■ Q. (S.45) 

Note that we can bound o by (1 + A)/ci < 2/ci when I < 3/2 or C2/(l - A) < (1 + d)c2 
when Z > 3/2, so C; is an universal constant depending on ci,C2,d and the zeros and first 
moments of k = 1, . . . ,1. As a result, E{^) G C°° and E{^) decays polynomially as 
fast as |^|~'. 
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Thus, by a direct expansion and Holder's inequality we have 

var[$<,(V'a,6)] - c7(6)^var[$(V'a,6)] 



< 



\/cr 



(S.46) 



where we use |'^a,5(OI = W^'^{o,0\^ f^ct that we focus on a e ["4^' ^~^] ' var($(^)) = 
I IV' (01 1 1,2 (M ?7) ~ definition of and the assumption that Cjf := J (iqq^|)2rdr/(0 < oo. 

Denote := and es := CfCn- 

As a result we have the following bound characterizing the influence of a and a: 

var($<,(^„,6)) < CT(6)Var($(V'a,6)) + e2(7{b)e„ + e^el 



(S.47) 



Similarly, by the assumption supp-;/' C [1 — A, 1 + A] we have 

f(*«,6)) = 11^^6(0 iU2(M,,) = a r \ek<)?Mi) 

J Cl 



var 



/ a|V'(c 

Jci 



and hence 



aOrd,y(0 = 47r^c^var($(V'a,6)) = 47r^c^a||V(aC)||i2 



var($^(^/>;_fc)) < a(6)2var($(V';,b)) + e4(j(6)e^ + eje^ 



(S.48) 



where 64 and 65 are constants depending on $, ci,C2,d and the zeros and first moments 
of '^'^^\ k = 1, . . . ,1. To make clear the relationship between a{b) and and simplify the 
notation, we bound va.v{^{tjja,b)) and var($(V'^ j)) simultaneously by 



ma^{var($(V'a,6)),var($«,b))} < {Eia{b) + E2e^f 



(S.49) 



where Ei^ > max{ 



200 S-rr^cpo 



}, 2E1E2 > max{e2,e4}, .£^2^ > max{e3,e5} and Ei and -E2 



are constants depending on ci, C2, and the zeros and first moments of tp^^^ k = 1,. . . ,1. 
To further ease the notation, in the following proof we write 



Ca := *cr(V'a,6) and Ca ■= ^^(V'l.fc)- 
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The proof and result will be done by tracing these two terms so that the interaction 
between cr(6), tcr and e is made clear. 

We comment that in the special case where $ is the Gaussian white noise, that is, 
p — q — Q^ ||'0(a^)||^2(K = can highly simply the representation of var($CT(7/'a_f,)) 

and var(<i>cr(V'a b))- Also notice that when cr(6) is of order e^, all three terms in (S.47) 



and (S.48I are equally important, while cr(6) is smaller than e^, the term in (S.47) and 
(S.48) is dominant. We will keep all these three terms to track the influence of the noise 
level a{h) and the "heteroskedasticity" level t^- Furthermore, although the dependence 



of var(Ca) and var(Ca) on the scale a are made clear in (S.47I and (S.48), to simplify the 



calculation, we bound a by — under the assumption of A 



Step 1: Handling the trend term T{t). When a G [ 



l-A 

C2 ' 



1+A 1 
ci J 



WTia,b) = J f\/^V^(a^)e'2''«''d^ = 



we have 



(S.50) 



since T, as a distribution in general, is compactly supported in ( — yqis'^i' T+S'^i) ^^'^ 
■ip is compactly supported in [1 — A, 1 + A] so that the support of V'(aC) is always away 



from f — ^r-^^c^ , j^cij. Thus, the existence of the trend term do not play a role in the 



l+A 



following analysis since we focus ourselves in the region a G [''^^i ^^!^] ^^^^ region 

Wria, b) = Wria, b) + Wf{a, b) + '^Mai) = Wf{a, b) + <^Ma,b)- 



Step 2: Approximating the random variable WY{a,b). Define Ya^b ■= Wf{a,b) + 
'^aii^a.b) and Xa.b '■= di,Wf{a,b) + ^crii^'ab) simply the notation. Fix 6 e M. Since 
EWV(a, 6) = Wf{a,b), by the Chebychev's inequality we know that 



Pr [\WY{a,b)-Wf{a,b)\ > (Eiaib))^^''} 
By Lemma [S.2.1[ we have 



< 



var(Ca) 



K 



Wf{a,b)-J2 {b)e'^^^' V^f {ac^'i {b)) 



(i?l'T(6))4/3- 



< Ewe, 



(S.51) 



1=1 



which with (S.51 ) we conclude that with probability higher than 1 — ^^]^^^}QjyjT/3^ , 



K 



1=1 



Step 3: Approximating the random variable cjy(a, 6). To evaluate a;y(a, 6) when 
a e Zk{b), we apply Lemma S.3.1 where w e take xq = dbWf{a, b), yo — Wf{a, &)and (2 ■- 

CaXC\B 

have 



\Wf{a,b)\/4. 



i-Wfia,b))- Notice that by (S.36) when |W^/(a,6)| > (£'icr(&)) 



2/3 , ,1/3 



Wf(a,b)\/4,(~Wf(a,b}) 



16var(CQ) 



< 



16{Eia{b) + E2e, 



9\Wfia,b)\^ 9((i;icr(6))2/3 + ei/3) 



(S.52) 



Dynamical Seasonality and Trend 1 9 



Also notice that by Lenima [S.2.1| we have 

To simply the bound, when |W/(a, b)\ > e^/^ and e is small enough, we have 

\u;f{a,b)\<q^',{b) + ^e'/'<2cj^',ib). 



Therefore, when \Wf{a, b)\ > {Ei(j{b)f/'^ + e^/^, by Lemma S.3.1 and taking the assump- 
tion that 4>'f.{b) < C2, we have 



E 



dt,Wf{a,b) + Ca'\ diWf{a,b) ^ 42c2var(Ca) + 4v/var(C)var(a) 



V Wj{a,b) + C^ ) 



Wf{a,b) 



< 



\Wf{a,bW 
(42c2 + 4)(£;ig(&)+^2£a)' 
\Wf{a,b)\^ 



(S.53) 



and 



' dbWf (a, b) + Ca'\ ^ 1008civar(Ca) + 272c2 Vvar(Ca)var(a) + 21var(C) 



V W^/(a,6) + e ^ 



< 



(1008e^ + 272c2 + 21)(£;icr(b) + £26^)^ 
\Wf{a,b)? 



(S.54) 



Thus, by the Chebyshev's inequality and ( |S.54[ ), when \Wf{a,b)\ > {Eia{b))'^^^ + e^^^ we 
have 



Pr 



V Wfia,b) + C^ J 



< 



Wfia,b) + C'J V Wfia,b) + C^ 

(lOOScj + 272c2 + 21){Eia{b) + ^2£a)^ 
((£;ia(6))2/3 + £1/3)2 



> 



I W^/ (a, 6)1 



As a result, by ( |S.52[ ) and ( |S.53| ), we know that when |VF/(a, 6)| > {Eia{b))^/^ + e^^^, 
dbWYia,b) dbWfia,b) ^ {^2 + 4.)iEia{b) + E^e^f , {Eiaib))^^ + e^/^ 



Wvia^b) Wf{a,b) 
with probability higher than 

16(^1(7(6) + S2ea)' 



> 1 



9((i;i(T(5))2/3 + el/3)2, 

{32c2 + 5)HEia{b) + E2e„y 



\Wf{a,bW \Wf{a,b)\ 
(1008c2 + 272c2 + 21)(^ig(&) + E2ea)^ 

((Sit7(6))2/3 + el/3)2 



((Sia(fe))2/3 + ei/3)^ 

In conclusion, when (a, 6) e ^fc(fo) and |H//(a, 6)| > (£'io-(fo))2/3 + ^1/3^ by Lemma 
with probability greater than 1 — (32c2+5) (-Eig(b)+g2eg) have 

((BicT(6))2/3+ei/3j 

K(a,6) - < \ujY{a,b)-ojf{a,b)\ + |c^/(a,6) - (j^'^ib)] 



(S.55) 



S.2.1 



< 



(42c2 + 4:){Eia{b) + E2e,f {Eia{b)f'^ + ei/3 



|Tyy(a,&)| = 



I W^; (a, 6)1 
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and hence conclude the second part of the Theorem. 



Step 4: Reconstruction of each seasonal component. Now we evaluate the recon- 



struction formula: 



/fe(6):=7^^/ / WY(,a,b)x\WY{a,b)\>E^a(b)a ^^^da, 



(S.56) 



which is a complex random variable. We start from preparing a bound. When |W/(a. h)\ > 
|_Eicr(6) and r < Eia{b), since the variance of "^aiipa.b) exists and Br{—Wf{a,b)) g 
C\5i£;ia(b)(0), we have 



< Cor^var(Ca) ^ f 

for some cq > 0. By the Chebychev's inequality we have /c^^i (o) ^^Ca — '^J^f^fsfyi 

2 

On the other hand, the maximum of TW7f^6y|4'^^'^(Ca) is achieved when |W/(a,6)| = 



Eia{b) and r = Eia{b)^ which is equal to We thus conclude that cq < 21 and 



B,(~Wf{a,b}) 



21var(Ca)^ 
\Wf{a,b)\^^ 



(S.57) 



for all \Wf{a,b)\ > ^Eiaib) and r < Eia{b). 



Note that for a fixed a > and 5 € K we have 

V.WYia,b)x\WY{a,b)\>Eia{b)ia) - Wf{a,b)x\Wf(a,b)\>Eia{b){a) 
= Wf{a,b) J {x\Wf{a.b)+y\>Eia(b) ~ X\Wfia,b)\>Eia(b))'iFi^aiy) 

+ j yX\Wf{a,b)+y\>Eia(b)'^FQ^{y)- (S.58) 



To bound the first term on the right hand side of (S.58), we consider two different cases 
for Wf{a,h). When \Wf{a,b)\ > ^Eia{b), X\Wf(a,b)+y\>Eicj{b) - X\Wf(aA)\>Eia{b) 7^ only 
when y e BE,a{b){~Wf{a,b)), that is. 



ix\Wf{a.b)+y\>Eia{b) - X\Wfia.b)\>Eia{b))'iF<:a{y) = / ^FCaiv)^ 

JBE^„it)(-Ws(a,b)) 



which, due to ( S.57 1, is bounded by '^\w'^lfii)\A. ■ Thus, the first term is bounded by ^^^'^^'^'y^p ; 



to bound the second term, since E^a = 0, we have 



yX\Wf{a,b)+y\>E,a{b){a)<iFQ^{y) = / ydi^Ca(y) 

'C\BE,„(b){-Wf(a,b)) 



SEi„(t,)(-W/(a,b)) 
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where the right hand side can be bounded by {\Wf{a, b)\ + Ei(T{b))^^^j^^§^ < p^^^^- 
When |W/(a, 6)1 < ^Eia{b), the first term is bounded by |PV/(a, 6)| and the second term 
is bounded by 



Si.(b)(-W/(a,b)) 



\y\dFcAy) < 



B2Eia{b) (0) 



\y\AF^M<EMb), 



where the last inequahty comes from the Holder's inequality and (S.49). As a result, we 
get 



< 



(S.59) 



Wf{a,b) J {x\Wf(a,b)+y\>Eia{b) - X\Wf(a.b)\>'lEia{b))^PCaiy) 

21var((|'(j)^ 

|p^^(a^5)|3^|W'Ha,6)|>|£;io-(fc) + \Wf{a,b)\xiWfiaM)\<lE,a{b) 



and 



< 



yX\Wfia,b)+y\>Eia(b)'^FCaiy) 

35var(Ca)2 



(S.60) 



|yy^(a^^)|3^|W'/(i,fc)l>fSi<T((,) + Eicr{b)x\Wf{a,b)\<iE,a{b)^ 
which together lead to the following bound for ( S.58| : 

\'EWY{a,b)x\WY{a.b)\>Eicr{b) - Wf{a,b)x\Wfia,b)\>Eia(b)\ 

56var(Ca)' 



< 



< 



\Wf{a,b)\'' 



;X|W/(a,6)|>f _Bicr(6) + '^Eia{b)x\Wf{a,b}\<^Eia{b) 



\Wf{a,b)\>'iEia{b) + •^X\Wf{a,b)\<'iEia{b) 



var(Ca)^ 



Note that we choose this rough bound simply to make the proof clean. By exchanging the 
expectation and integration in (S.56) we have 



Jz.Jb) 



< 



Zk(b) 

20 {Ei<j{b) + E^e,)'' 



b)\>Eia(b) 



da 



rdo 



< 



[EMb)f Jz.^b) 
40 {E^a(}3) + E^e„Y vTTA - yTTA 



< 



■R^ {Eia{b)f 

40 {Eiajb) + E2ea)^ 
Cl7^v, {Eiaib))^ 



Mb) 



(S.61) 



where we use ci < 4>'j.{b) < C2- 



Next we evaluate the variance of By definition, TZ'^va.rfk{b) becomes 

/ / T^{WY{a,b)x\WYia.b)\>Ei<7{b) -'EWYia,b)x\WY{a,b)\>Eiaib)) 

JZk[b) JZk{b) 

X (WV(a', b)x\WY(.a'.b)\>E^a{b) - EWy (a', b)x\WY(a' ,b)\>Eia{b)) {aa'y^/'^dada' , (S.62) 
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where the expectation inside the double integral, by definition, becomes 



Wf{a, b)[x\Wf{a,b)+y\>Eicy{b) " X| W/ (a,6) | >£;i<T(b) ) 
X Wf{a' , b){x\Wf(a' M)+y'\>Eia(b) " X\Wf(a' ,b)\>Eia{b))'^FY^,Y^, iv, u') 
+ J Wf{^jb)[x\Wf(a,b)+y\>Eia-{b) - X\Wf{a,b)\>Eia-{b)) 
X y'X\Wfia',b)+v'\>Eia(b)dFY^.Y^, {V , v') 



+ j Wf{a' ,h){x\Wj(a' ,b)+y'\>Eia(b) - X\Wf(a',b)\>Eia{b)) 

X VX\Wf(a.b) + y\>Eia{b)'i'FY^,Y^, {V , v') 
+ / yV'X\Wf{a,b)+y\>Exa(b)X\Wf{a' ,b)+y'\>Eicr(b)'^FY^,Y^,{V^y') 



(S.63) 

(S.64) 

(S.65) 
(S.66) 



E^(&)-7^,: 



Zk(b) 



Wf{a,b)a ^/^X|W'/(a,b)|>£;i<T(b)da 



We simply bound (S.63), (S.64), (S.65) and (S.66 1 by the Holder's inequality. We prepare 



two simple bounds. First, by the same argument as that for (S.59), we have 

yi \Wf{a,b)\^\x\Wf{a,b)+v\>Et(T{b) - X\Wf{a,b)\>EicT{b)\ di^c 
- 1^^/(0, 6)1 ^ \Wfi°'MX\Wf(a,b)\<^E^a(b)■ 

Second, we simply bound 

X\WJ{a,b)+y\>E^a{b)\y?'i■Fc_Sy) < Var(Ca) 



(S.67) 



(S.68) 



By the Holder's inequality and (S.67), (S.63) is bounded by 

(|]:^7^(~^^|H'/(a,6)|>fi5i<T(b) + Ws{a,b)\x\WJ{a,b)\<lE^a{b) 



■(fc) 



Similarly, by (S.67 1 and (S.68 1, (S.64) and (S.65) are together bounded by 



|^^(a^5)| X|w,(a,b)|>igi.(fc) + HX\Ws{a.b)\<\E,.{b)) v/var(Ca) 



We simply bound (S.66) by ^ var(Ca)var(C£i/ ) by the Holder's inequality and bound the 
last term by (S.61). Putting the above bounds together leads to the following bound of 
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nl^&vfkih) in (S.62): 



Z (b) (|M^y(a,6)|^l^^("''')l>5'Bl'T(fc) + \^f(^MX\Ws(a^)\<lE^a{b) 



.13 3. 2 (^1^(5) +^26.)^ 

- L 3 2^ {Eia{b)Y 

-'''^ {BMW 



Zk{b) 



(S.69) 



As a result, by (S.69) and the Chebychev's inequaUty, we know that with probabiUty 
greater than 1 — 7~^, where 7 > 1, /fe(6) deviates f rom f k{b) with err or bou nded by 
T^^^lV70c2 ^^^'^E^aib)'^''^ Combining this with (S.61) and Lemma S.2.1 we conclude 
that with probability higher than 1 — 7^^, 



Eia{b) 



7V70c2 + 



+ {Ewe + Eia{b))^^A. 



The proof is thus finished. 



S.4. Proof of Theorem 



The proof is essentially the same as that of Theorem 3.1 once we get the following approx- 
imations specific to the discretization. 

Step 1: Approximating the discretized signal By the Poisson's formula, given 
T := {T{nT)}nez, where r > 0, for a > and 6 G M, we have 



W.,.M-rj:TUr>^X^) - j:^{n,l.jC-i)} (2). (3.70) 



Note that Wria^b) = -7^ } (0) = when a € [^,^]- Thus, in 

order to analyze the error introduced by the finite sampling, we focus on analyzing 
Enez,n^oW^"^^\a,b), where we denote W^''^^\a,b) := J" |t(0^^(^^) | (^). By 



the Plancheral's Theorem: 



W|"/"^(a,5) := / T{b)^^P 



t-b 



a \ a 
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for all n 7^ since r < ^ and T is compactly supported. Thus, when a G [^^^, ^'ET'I ' 
we have 

Wria, b) = Wria, b) + ^ W^"/^' (o, 6) = Wt((x, 6) = 0. 

rieZ,ri/0 

In other words, when the sampling rate is high enough, we can ignore the trend component 
of the observation and focus on the seasonal components. In the case that the sampling 
rate is not that high, the aliasing effect will show up and extra error term will come into 
play. 

Step 2: Estimate WY{a,b), ujyia^b) and the reconstruction. To finish the proof, 
we need the following approximations. First, the CWT of Y becomes 

= Wf{a, &) + T ^ a{nT)^„xpa{nT - b). 

Similarly, we have 

nez 

= d,Wf{a,b) + TY,<nT)^n^^^\nT-b). 

Denote 

^a{a) := T ^ G{nT)^n'4}a{nT - b), *(a) := r ^ ^n'^ainr - b) 

neZ neZ 

and 

*;(a) := T ^ ainT)^ni^i'\nT - b), *'(a) := r ^ $„Vi'^(nT - b). 
nel, nez 
Note that V^ctCo) and "if'cria) are both complex random variables. Clearly £^^(0) = and 
E\I>J^(a) = 0. To evaluate variances of ^'cr(a) and ^'^{a), recall that the covariance function 
7(n) := EXnXo is a positive definite sequence. Thus, since if} € S and X is stationary, we 
are allowed to exchange E and X^nez have 



var(*cr(a)) := r^E ^ (j{nT)Xn'ipa{nT - 6) ^ a{mT)Xmi>a{mT - 6) 

nGZ mGZ 

= ^ EX„X„(T(nT)V'o(nT - b)a{mT)il>a{mT - b) 
= ^ ^{n — m)a{nT)^l)a{nT — h)a{rnT)'il)a{rnT — b) 

where the last equality comes from the Bochner's theorem. Similarly we have 



var(*(a)) = / |r V ^a{nT - 6)e-™« 'd/x(0. 
•^0 ' „ez 
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Since ip € S, hy the Poisson formula, we have 

T J2 ^(«T)Va(nr - 6)e-^"« = - fe)} 



where T means Fourier transform. By the same argument as that for Lemma S.2.2 and 
by the assumption for ct, we have 



and hence 



2t:t 



|var(*^(a)) - cr(6)2var(*(a)) I < T'^er,ia{b)e^ + r^'e^^zfa 



where e-rfi, i and 6^- 2 are constants depending on ci, C2, d and the first two moments 
of V and V'- Since < r < (jq^^, a e ^] and = we have 

^ > 1 + A and the following bound: 



var( 



dM(e) 



■00 





where we use the fact that 



\2tttJ 



d^(^) = a 







2'kt' 



d/i(^) < ^ max 



1-A 1 + A 



V 27rr / 



iv^(^)rdMo 



< 



Cl 



As a result, we get 



var(^'^(a)) < —9rCr{bf + T^er,icr{b)€„ + T^er.2€l 

Cl 



Similarly, we have 



var(^'(a)) = 



27r 



2nT 



a 



27rr(l + A) 



Zttt 



and 



var(^';(a)) < 167r2c2(l + d)9r<7ibf + T^er,3a{b)e^ + T^e^^^tl, 

where Ct-.s and e,-,4 are constants depending on ci, C2, d and the first two moments of V' 
and ■i/''. As in the proof of Theorem 3.1 to make clear the relationship between cr(6) and 
60- and simplify the notation, we bound var(5'o-(a)) and var(^'Jj(a)) simultaneously by 

max{var(*^(a)),var(*;(a))} < {Er,io{b) + T^Er.i^af , 

where E'r,!^ > max{^, 167r^C2(l+(i)0T}, '2Er.iEr.2 > max{eT-,i, er^a}, £^r,2^ > max{eT-,2,eT 
and Er^i and -Br,2 are constants depending on 5*, ci, C2, d and the zeros and first moments 
of V' and ip' . 

With the above estimations and Lemma [S.2.2[ we can finish the proof by following the 
same line as that for Theorem |3.1| Since the proof is exactly the same, we will skip it. 
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S.5. Discrete- and continuous-time Autoregressive lUloving Average processes 

S.5.1. Discrete-time Autoregressive Moving Average Processes 

Take a time series Xt. Denote B as the lag operator, that is, BXn = Xn-i. Denote Wt as 
an white noise, i.e., an independent random variable with mean and variance a^. Note 
that uJt might have fat tail or the variance of wt might not be finite. An Autoregressive 
and Moving Average Process of order (p, q), denoted as ARMA(p,q), where p,q € NU {0}, 
is defined as a complex- valued strictly stationary solution y„, n G Z, of the difference 
equation: 

cj){B)Yt = eiB)ujt, (S.71) 
where 4>{-) and 9{-) are polynomials defined as 



(z) = 1 - ... - 0p^P, O{z) = l-0iz- 



with £ C, 9j £ C for i — 1, .. . ,p and j — 1, . . . ,q. In Brockwell and Lindner (2010) 



the following result was provided: suppose a;„ is a nondeterministic independent white 



noise sequence. Then (S.71) admits a strictly stationary solution y„ if and only if (i) all 
singularities of 9{z)/(j){z) on the unit circle are removable and Elog"*" |a;i| < oo, or (ii) all 
singularities of 0{z)/4){z) are removable. When < oo, the power spectral density d/i(^) 
of the strictly stationary solution Y„ satisfies 



S.5.2. Continuous-time Autoregressive Moving Average Processes 

In this subsection we recall the basic definition of the continuous-time autoregressive and 
moving average (CARMA) process. CARMA of order (p, q), denoted as CARMA(p, g), has 
been studied in [Brockwell (200l ); Brockwell and Hannig (2010). Consider the following 
stochastic differential equation 



a{D)Yt = abiD)DWt, 



(S.72) 



where a{z) = z^ + aiz^ + . . . + Up is the autoregression polynomial which is nonzero on 
the imaginary axis, b{z) — bo + biz-'r . . . + bqz'^ is the moving average polynomial, cr > 0. In 



Brockwell (2001| ) , the case p > q was considered and the solution is a stationary random 
process in the ordinary sense. The strictly stationary solution is unique Brockwell (2001| ) ; 
Brockwell and Hannig (2010) if all singularities of the meromorphic function b{z)/a{z) 



on the imaginary axis are removable. Also the necessary and sufficient condition for the 
existence of the CARMA(p, 0) process is that a{z) has no zeros on the imaginary axis. 



In Brockwell and Hannig (2010), the constraint p > q was removed by taking the 
generalized random process (GRP) Gel'fand and Vilenkin (1964) into consideration]^ 
Indeed, with the assumption on a{z) of order p > and b{z) or order g > 0, where p may 
be less and equal to g, the CARMA(p, q) GRP Y is defined as: 



Y 



6,a:(j) 



y-9 



b.W^i- 



if p > 
if p = 



(S.73) 



^Note that the standard Guassian white noise is CARMA(0, 0) GRP with a[z) = 1 and b{z) — 1, 
which is a special p = q case. 
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where 



i-i 



k—p— 1 

where is the unit p- vector with the i-th entry 1, 



(S.74) 



A 










1 







1 








1 

-ai 



11 



a{z) 



^p-l]T 



a{z) 



-dzx(_oo,o](i), 



where ji and are close curves in C enclosing the zeroes of a(-) with strictly negative 
and strictly positive real parts, respectively. We call A the companion matrix of the 
autoregression polynomial 0(2). g can be understood as the kernel of the CARMA process, 
which is clear when p > q. Notice that when p > q, Y is causal when the real part of the 
eigenvalues of A are all negative, which can be seen from (S.73) Brockwell (2009). Also 
notice that when p > q, the summation term in (S.74 1 does not exist; when p = q, the 
summation term is reduced to W^'^^^ the standard Gaussian white noise. The algebraic 
forms of the residue of e^*[l z ... zP~^]'^ / a{z) at the zeros A of a{z) are 

git) 



X:3t\<0 k=l 



^p-l]T^zt. 



-Hz)] 



2m{ij{X) 



1)1 



-X[0,oc){t) 



*a-i(z)] 



A:5RA>0 fe=l 



2m{ii{X) - 1)! 



p(A)-l 

^ J2 aAfct''e^*X(o,oo)(i) 

A:SR:A<0 k=l 



A:sr:a>o fe=i 



/.(A)-l 

^ /3Afc^''e^*X(^oo.o)W 



X(-oo,0] (t) 

(S.75) 



where ^(A) is the multiplicity of the zero A, a\k,i3xk G Note that the entries of g{t) 
are bounded and decay exponentially. Also note that the term J2k^p-i W' ^~''^ef A'^e^ in 



(S.74) exists only when q > p. Moreover, there exist vectors £{0) and r(0) (Brockwell and 



Lindner, 2009 Proposition 3.2) so that 

g = e^*^(0)x[o,oo) W - e^*r(0)x(-oo,o](0- 



(S.76) 



The relationship between the CARMA (p, g) GRP defined in (S.72) and discrete-time 
ARMA process is discussed in Brockwell and Hannig (2010) and we briefly state it here. 
Define <j){z) := a{6~^{l — z)) and 9{z) := b{S~ (1 — z)) , where 5 > is small enough so that 
none of the zeros of (f>{z) lies on the unit circle. Let (Kn)n6Z be the stationary solution to 
the following ARMA equation: 
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where Z„ is i.i.d. Gaussian with mean and variance 1. Define a new GRP Y^ := ^[4/5]- 
It is shown in Brockwell and Hannig (2010 1 that converges to Yt in the sense of finite 
dimensional distribution. In this sense, we can approximate a CARMA(p,q) GRP by a 
ARMA time series in the distribution sense. 

We mention that the innovation process Wt can be generalized to the Levi process, as 



is discussed in Brockwell and Lindner (2009). To simply the discussion, we focus ourselves 



on the Brownian motion case. Also the CARMA can be generalized to the fractal CARMA 
process by introducing the fractal integral part. Again, to simplify the discussion, we focus 
ourselves on the CARMA case. We mention that the proof for Theorem |3.1| and |3.2| for 
the Levy-driven CARMA process can be carried out in the same way when p > q. 

Below we show the power spectrum of a given CARMA(p, q) GRP so that we can see 
how it depends on the scaling of the measurement function. 



Lemma S.5.1. Fix a CARMA(p,q) GRP $, where p,g e N U {0}, defined in (S^ . 
For ip £ S, we have 



var<I>(i/'a) = 
where X are the roots ofa{z), 



IE7 



i2TT^/a + A 



q i~l 



i—p k—p—1 



b,e{A''ep{2TTi^) 

f-,i—k— 1 



and c?o.o(a)C) = 1- ^'^ particular, the power spectrum of ^ is 



i2< + A 



de 



Remark: Note that when q < p, A-q decays as fast as |^|~^, so := /(I + 
is finite when ^ > 1. When q > p, dp^q is a polynomial of degree q ~ p. In this case, 
/(I + lCl)^^'d77 is finite when 21 > q — p + 2. In conclusion, CARMA(p,q) process 
fits our Theorem. 



Proof (Proof of Lemma S.5.1). Take ip eS. By (S.73I, we have 



1 i-i 

j=0 k—p—1 



= Y.l^,eU' Ji9^MmW{t) + Y^ X i-iy-%e^A'^epW{i^i';'^), 

j=0 j=pk=p-l 

where g(t) — g{—t) and W{^''J^''^) is a complex Gaussian random variable by the as- 
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sumption. By a direct calculation we know 



j=p fc=p-i 

9 



(S.77) 
(S.78) 



3=0 •' 



j=p k=p-l 



Note that when p > q, (S.78 1 and (S.79I disappear. Recall the fact that 



EW{f^^^)W{gW) = / fgdt 



(S.79) 



(S.80) 



when f,geS. By (S.80), (S.77) becomes 



1 _ r 

^ b,bk / elA^{g*iba.b){t)e'[A'^{g*ija^b){t)dt 

j,k=o 

^ bjhtr{{A^feie{A^ / (g ★ Va,6)(i)(g * V-a.b) W^dt}, 



(S.81) 



where in the last equality we use the fact that v^wu^w = ti{vu^ wuF) when v,w,u £ C. 
Since each entry of g is bounded, C°° except at and exponentially decay, each entry of 
g * is a Schwartz function. 

Thus, by the Plancheral theorem the integral in (S.81) becomes 



ig*'>Pa,b){t){g*^Ja,b){t) dt = / g^^Ja^iOg^i^aAO dS, 



gio^'aAmo^Mm = « J di-m-o^mciOM, (s.82) 

where the last equality holds due to g(^) = g(— C) and "tJja.biO — \/ ai^{a'C)'^~^'^^^^ ■ By 



plugging (|S.82| into (|S.81|), E 



becomes 



E hMT[{A^)'^e,elM I g(-O^RF«l^(aOl'd^} 
^ / efA^g(-O^P^^R)a|^«)l'dC, 

-.I. n ^ 



(S.83) 



j,k=0 
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By a direct calculation, we know when sign(3fiA) < 0, the Fourier transform of e'*'*X[o,oo) 
is ^27r{-A ' '^tisii sign(3?A) > 0, the Fourier transform of e'^*X(-oo,o] is a^its^^ ■ Thus by the 
assumption that the eigenvalues of A are all simple, we know that 



(S.84) 



where 7;^ = a.x when 3fiA < and = when 3fiA > 0. Plugging ( |S.84[ ) into ( |S.83[ ) 
leads to 



\ .. J n ^ 



X.jj, j,k—0 

Y y 

^ X -I- iOirf: In ^ 



(A + i27r^)(/i-f27rC) 



(S.85) 



where c\ := X]?=o ^J^i'^'^'^A ^ C. Next, (S.78) becomes 



i— p k—p~l 
q i-1 j-1 



b ' 



ij—p k—p—1 l—p—1 
q i-1 j-1 



= E E Y.(-^y^''"Ab-^^'^^"^vh,elA^e,j4>tb"^^^ (S.86) 

i.j—p k—p— 1 l—p— 1 

where due to V'a^^(0 — (^27r^)^>/a^(a^)e~*^^^^ the integral in the last equality becomes 



(-i> 



and thus (S.78 1 becomes 



K,g{a,0\'\m\^d^- 



(S.87) 



Similarly, by ( |S.80[ ), ( |S.79[ ) becomes 

9 



E 



g 9 i-1 



[E&.ef^^/(9*v^a..)(t)dM/w][E E (-i)'-'&ief'^'=epmv'i!: 



= EE E (-l)'-'^'iel^^^-eAefA^E[(y 9*^a,bWdT4^(t))mC;'')_ 

j'— l—p k—p—1 

= EE E {-l)'-''-\elA^e^b,elA= j g * Va,6 W<^'"'^i)di. (S.88) 



" ' ' 1 
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By the same arguments as those for (S.81), the integral term in (S.88I becomes 



l-k-l / cl-k-l 



{-27Ti) 



l-k-l 



l-k- 



7a 



and (S.88) becomes 

q q l-l 

-EE E 

j—O l—p k—p—1 
q l-l 



(27ri) 



l-k-l 



-,1-k-l 



Cl-k-l 



l—p k—p—1 

/ c?p,g(a>OE 



A j=0 



(S.89) 



With (S.85I, (S.87) and (S.89), var$(V'a,&) becomes 



J I ^ 127:^/ a + A 



(S.90) 



Thus, by polarization, we know that the unique positive tempered measure associated 
with $ is 



(S.91) 



Notice that dp q{a, = when p > q, and by definition we have 



E 



i27rf + A 



i27r£ + A ^ ^ 1 
A ^ j=o 



As a result we have 



d^(C) = |E 



CA 



i27r^ + A 



dC = 



dC, 



which coincides with the result in Brockwell (2001 ). 



